Lecture Notes in Artificial Intelligence Edited by R. Goebel, J. Siekmann, and W. Wahlster
Subseries of Lecture Notes in Computer Science
6277
Rossitza Setchi Ivan Jordanov Robert J. Howlett Lakhmi C. Jain (Eds.)
Knowledge-Based and Intelligent Information and Engineering Systems 14th International Conference, KES 2010 Cardiff, UK, September 8-10, 2010 Proceedings, Part II
Series Editors Randy Goebel, University of Alberta, Edmonton, Canada Jörg Siekmann, University of Saarland, Saarbrücken, Germany Wolfgang Wahlster, DFKI and University of Saarland, Saarbrücken, Germany Volume Editors Rossitza Setchi Cardiff University, School of Engineering The Parade, Cardiff CF24 3AA, UK E-mail:
[email protected] Ivan Jordanov University of Portsmouth, Dept. of Computer Science and Software Engineering Buckingham Building, Lion Terrace, Portsmouth, PO1 3HE, UK E-mail:
[email protected] Robert J. Howlett KES International 145-157 St. John Street, London EC1V 4PY, UK E-mail:
[email protected] Lakhmi C. Jain University of South Australia, School of Electrical and Information Engineering Adelaide, Mawson Lakes Campus, SA 5095, Australia E-mail:
[email protected]
Library of Congress Control Number: 2010932879
CR Subject Classification (1998): I.2, H.4, H.3, I.4, H.5, I.5
LNCS Sublibrary: SL 7 – Artificial Intelligence
ISSN: 0302-9743
ISBN-10: 3-642-15389-5 Springer Berlin Heidelberg New York
ISBN-13: 978-3-642-15389-1 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. springer.com © Springer-Verlag Berlin Heidelberg 2010 Printed in Germany Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper 06/3180
Preface
The 14th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems was held during September 8–10, 2010 in Cardiff, UK. The conference was organized by the School of Engineering at Cardiff University, UK and KES International. KES2010 provided an international scientific forum for the presentation of the results of high-quality research on a broad range of intelligent systems topics. The conference attracted over 360 submissions from 42 countries and 6 continents: Argentina, Australia, Belgium, Brazil, Bulgaria, Canada, Chile, China, Croatia, Czech Republic, Denmark, Finland, France, Germany, Greece, Hong Kong ROC, Hungary, India, Iran, Ireland, Israel, Italy, Japan, Korea, Malaysia, Mexico, The Netherlands, New Zealand, Pakistan, Poland, Romania, Singapore, Slovenia, Spain, Sweden, Syria, Taiwan, Tunisia, Turkey, UK, USA and Vietnam. The conference consisted of 6 keynote talks, 11 general tracks and 29 invited sessions and workshops, on the applications and theory of intelligent systems and related areas. The distinguished keynote speakers were Christopher Bishop, UK, Nikola Kasabov, New Zealand, Saeid Nahavandi, Australia, Tetsuo Sawaragi, Japan, Yuzuru Tanaka, Japan and Roger Whitaker, UK. Over 240 oral and poster presentations provided excellent opportunities for the presentation of interesting new research results and discussion about them, leading to knowledge transfer and generation of new ideas. Extended versions of selected papers were considered for publication in the International Journal of Knowledge-Based and Intelligent Engineering Systems, Engineering Applications of Artificial Intelligence, Journal of Intelligent Manufacturing, and Neural Computing and Applications. We would like to acknowledge the contribution of the Track Chairs, Invited Sessions Chairs, all members of the Program Committee and external reviewers for coordinating and monitoring the review process. We are grateful to the editorial team of Springer led by Alfred Hofmann. Our sincere gratitude goes to all participants and the authors of the submitted papers. September 2010
Rossitza Setchi Ivan Jordanov Robert J. Howlett Lakhmi C. Jain
Organization
KES 2010 was hosted and organized by the School of Engineering at Cardiff University, UK and KES International. The conference was held at the Mercure Holland House Hotel, September 8–10, 2010.
Conference Committee General Chair Conference Co-chair Executive Chair Chair of the Organizing Committee Program Chair
Rossi Setchi, Cardiff University, UK Lakhmi C. Jain, University of South Australia, Australia Robert J. Howlett, University of Brighton, UK Y. Hicks, Cardiff University, UK I. Jordanov, University of Portsmouth, UK
Organizing Committee KES Operations Manager Publicity Chairs KES Systems Support Members
Peter Cushion, KES International D. Todorov, Cardiff University, UK Yu-kun Lai, Cardiff University, UK Shaun Lee, KES International Engku Fadzli, Cardiff University, UK Lei Shi, Cardiff University, UK Nedyalko Petrov, Portsmouth University, UK Panagiotis Loukakos, Cardiff University, UK
Track Chairs Bruno Apolloni Bojana Dalbelo Basic Floriana Esposito Anne Hakansson Ron Hartung Honghai Liu Ngoc Thanh Nguyen Andreas Nuernberger Bernd Reusch Tuan Pham Toyohide Watanabe
University of Milan, Italy University of Zagreb, Croatia University of Bari, Italy Stockholm University, Sweden Franklyn University, USA University of Portsmouth, UK Wroclaw University of Technology, Poland University of Magdeburg, Germany University of Dortmund, Germany University of New South Wales, Australia Nagoya University, Japan
Invited Sessions Chairs 3D Visualization of Natural Language Intelligent Data Processing in Process Systems and Plants A Meta-Heuristic Approach to Management Engineering Knowledge Engineering and Smart Systems Skill Acquisition and Ubiquitous Human– Computer Interaction Application of Knowledge Models in Healthcare Knowledge Environment for Supporting Creative Learning ICT in Innovation and Creativity Intelligent Support for Designing Social Information Infrastructure Intelligent Systems in Ambient-Assisted Living Environments Knowledge-Based Systems for e-Business Quality Assurance and Intelligent Web-Based Information Technology Knowledge-Based Interface Systems
Reasoning-Based Intelligent Systems Data Mining and Service Science for Innovation
Minhua Eunice Ma, University of Derby, UK Nikolaos Antonopoulos, University of Derby, UK Bob Coyne, Columbia University, USA Kazuhiro Takeda, Shizuoka University, Japan Takashi Hamaguchi, Nagoya Institute of Technology, Japan Junzo Watada, Waseda University, Japan Taki Kanda, Bunri University of Hospitality, Japan Huey-Ming Lee, Chinese Culture University, Taiwan Lily Lin, China University of Technology, Taiwan Cesar Sanin, University of Newcastle, Australia Hirokazu Taki, Wakayama University, Japan Masato Soga, Wakayama University, Japan István Vassányi, University of Pannonia, Hungary György Surján, National Institute for Strategic Health Research, Hungary Toyohide Watanabe, Nagoya University, Japan Tomoko Kojiri, Nagoya University, Japan Toyohide Watanabe, Nagoya University, Japan Taketoshi Ushiama, Kyushu University, Japan Toyohide Watanabe, Nagoya University, Japan Naoto Mukai, Tokyo University of Science, Japan Antonio F. Gómez-Skarmeta, Universidad de Murcia, Spain Juan A. Botía, Universidad de Murcia, Spain Kazuhiko Tsuda, University of Tsukuba, Japan Nobuo Suzuki, KDDI Corporation, Japan Anastasia N. Kastania, Athens University of Economics and Business, Greece Stelios Zimeras, University of the Aegean, Greece Yuji Iwahori, Chubu University, Japan Naohiro Ishii, Aichi Institute of Technology, Japan Yoshinori Adachi, Chubu University, Japan Nobuhiro Inuzuka, Nagoya Institute of Technology, Japan Kazumi Nakamatsu, University of Hyogo, Japan Jair Minoro Abe, University of Sao Paulo, Brazil Katsutoshi Yada, Kansai University, Japan Takahira Yamaguchi, Keio University, Japan Maria Alessandra Torsello, University of Bari, Italy
Web 2.0: Opportunities and Challenges for Social Recommender Systems Innovations in Chance Discovery Personalization of Web Contents and Services Advanced Knowledge-Based Systems Knowledge-Based Creativity Support Systems
Intelligent Network and Service Real-World Data Mining and Digital Intelligence
Advanced Design Techniques for Adaptive Systems
Soft Computing Techniques and Their Intelligent Utilizations Toward Gaming, Robotics, Stock Markets etc. Methods and Techniques of Artificial and Computational Intelligence in Engineering Design Philosophical and Methodological Aspects of Reasoning and Decision Making
Jose J. Pazos-Arias, University of Vigo, Spain Ana Fernandez-Vilas, University of Vigo, Spain Akinori Abe, University of Tokyo, Japan In-Young Ko, Korea Advanced Institute of Science and Technology (KAIST), Korea Juan D. Velásquez, University of Chile, Chile Alfredo Cuzzocrea, ICAR-CNR and University of Calabria, Italy Susumu Kunifuji, Jaist, Japan Kazuo Misue, University of Tsukuba, Japan Hidehiko Hayashi, Naruto University of Education, Japan Motoki Miura, Kyushu Institute of Technology, Japan Toyohisa Nakada, Niigata University of International and Information Studies, Japan Tessai Hayama, JAIST, Japan Jun Munemori, Wakayama University, Japan Takaya Yuizono, JAIST, Japan Rashid Mehmood, Swansea School of Engineering, UK Omer F. Rana, Cardiff University, UK Ziad Salem, Aleppo University, Syria Sorin Hintea, Technical University of Cluj-Napoca, Romania Hernando Fernández-Canque, Glasgow Caledonian University, UK Gabriel Oltean, Technical University of Cluj-Napoca, Romania Norio Baba, Osaka Kyoiku University, Japan
Argyris Dentsoras, University of Patras, Greece Nikos Aspragathos, University of Patras, Greece Vassilis Moulianitis, University of the Aegean, Greece
Vesa A. Niskanen, University of Helsinki, Finland
Semantic Technologies for Knowledge Workers
Tools and Techniques for Effective Creation and Exploitation of Biodiversity Knowledge Immunity-Based Systems
Andreas Dengel, German Research Center for Arificial Intelligence (DFKI), Germany Ansgar Bernardi, German Research Center for Arificial Intelligence (DFKI), Germany Andrew C. Jones, Cardiff University, UK Richard J. White, Cardiff University, UK Gabriele Gianini, Università degli Studi di Milan, Italy Antonia Azzini, Università degli Studi di Milan, Italy Stefania Marrara, Università degli Studi di Milan, Italy Yoshiteru Ishida, Toyohashi University of Technology, Japan Takeshi Okamoto, Kanagawa Institute of Technology, Japan Yuji Watanabe, Nagoya City University, Japan Koji Harada, Toyohashi University of Technology, Japan
Program Committee Abe, Akinori Alexandre de Matos Araujo Angelov, Plamen Anwer, Nabil Aoki, Shingo Apolloni, Bruno Aspragathos, Nikos A. Bannore, Vivek Barb, Adrian S. Becker-Asano Bianchini, Monica Bichindaritz, Isabelle Boeva, Veselka Boutalis, Yiannis Brna, Paul Buckingham, Christopher Camastra, Francesco Cao, Cungen Ceccarelli, Michele Chalup, Stephan Chang, Bao Rong Chen, Lihui Chen, Toly Cheng, Kai Cheung, Benny Cobos Pérez, Ruth Crippa, Paolo
IREIIMS University, Japan Rui, University of Coimbra, Portugal Lancaster University, UK LURPA - ENS CACHAN, France Osaka Prefecture University, Japan University of Milan, Italy University of Patras, Greece University of South Australia, Australia Penn State University, USA Christian, Intelligent Robotics and Communication Labs, Japan University of Siena, Italy University of Washington, USA Technical University of Sofia, Bulgaria Democritus University of Thrace, Greece University of Glasgow, UK Aston University, UK University of Naples Parthenope, Italy Chinese Academy of Sciences, China University of Sannio, Italy The University of Newcastle, Australia National University of Kaohsiung, Taiwan Nanyang Technological University, Singapore Feng Chia University, Taiwan Brunel University, UK Honk Kong Polytechnic University, Hong Kong Universidad Autónoma de Madrid, Spain Università Politecnica delle Marche, Italy
Cuzzocrea, Alfredo Damiana, Maria Luisa, Dasiopoulou, Stamatia De Cock, Martine De Wilde, Philippe Dengel, Andreas Duro, Richard J. Dustdar, Schahram Elomaa, Tapio Fernandez-Canque, Hernando Georgieva, Petia Godoy, Daniela Grabot, Bernard Graña Romay, Manuel Grecos, Christos Hara, Takahiro Hintea, Sorin Honda, Katsuhiro Hong, Tzung-Pei Hu, Chenyi Hurtado Larrain, Carlos Ichalkaranje, Nikhil Ishibuchi, Hisao Ishida, Yoshiteru Ito, Takayuki Ivancevic, Tijana Janicki, Ryszard Jastroch, Norbert Jensen, Richard Jones, Andrew Jordanov, Ivan Jung, Jason J. Juric, Matjaz B. Katagiri, Hideki Ko, In-Young Kodogiannis, Vassilis S. Koenig, Andreas Kojadinovic, Ivan Kompatsiaris, Yiannis Konar, Amit Koshizen, Takamasa Koychev, Ivan Kwong, C.K. Lee, Dah-Jye Lee, W.B. Likas, Aristidis
University of Calabria, Italy University of Milan, Italy Informatics and Telematics Institute, Greece University of Washington, USA Heriot-Watt University, UK German Research Center for Arificial Intelligence (DFKI), Germany Universidade da Coruña, Spain Vienna University of Technology, Austria Tampere University of Technology, Finland Glasgow Caledonian University, UK University of Aveiro, Portugal UNICEN University, Argentina LGP-ENIT, France Universidad del Pais Vasco, Spain University of West Scotland, UK Osaka University, Japan Cluj-Napoca University, Romania Osaka Prefecture University, Japan National University of Kaohsiung, Taiwan University of Central Arkansas, USA University of Chile, Chile University of South Australia, Australia Osaka Prefecture University, Japan Toyohashi University of Technology, Japan Massachusetts Institute of Technology, USA University of South Australia, Australia McMaster University, Canada MET Communications GmbH, Germany Aberystwyth University, UK Cardiff University, UK University of Portsmouth, UK Yeungnam University, Korea University of Maribor, Slovenia Hiroshima University, Japan KAIST, Korea University of Westminster, UK Technische Universitaet Kaiserslautern, Germany University of Auckland, New Zealand Informatics and Telematics Institute, Greece Jadavpur University, India Honda R&D Co., Ltd., Japan University of Sofia, Bulgaria The Hong Kong Polytechnic University, Hong Kong Brigham Young University, USA Hong Kong Polytechnic University, Hong Kong University of Ioannina, Greece
Lim, C.P. Liu, Lei Maglogiannis, Ilias Maier, Patrick Marinov, Milko T. McCauley Bush, Pamela Montani, Stefania Moreno Jimenez, Ramón Nguyen, Ngoc Thanh Nishida, Toyoaki Niskanen, Vesa A. Ohkura, Kazuhiro Palade, Vasile Pallares, Alvaro Paranjape, Raman Pasek, Zbigniew J. Pasi, Gabriella Passerini, Andrea Pazos-Arias, Jose Petrosino, Alfredo Prada, Rui Pratihar, Dilip Kumar Putnik, Goran D. Reidsema, Carl Resconi, Germano Rovetta, Stefano Sansone, Carlo Sarangapani, Jagannathan Sato-Ilic, Mika Schockaert, Steven Seiffert, Udo Simperl, Elena Smrz, Pavel Soroka, Anthony Szczerbicki, Edward Tanaka, Takushi Teng, Wei-Chung Tichy, Pavel Tino, Peter Tolk, Andreas Toro, Carlos Torra, Vicenc Tsihrintzis, George Tsiporkova, Elena Turchetti, Claudio
Universiti Sains Malaysia, Malaysia Beijing University of Technology, China University of Central Greece, Greece The University of Edinburgh, UK University of Ruse, Bulgaria University of Central Florida, USA Università del Piemonte Orientale, Italy Universidad del Pais Vasco, Spain Wroclaw University of Technology, Poland Kyoto University, Japan University of Helsinki, Finland Hiroshima University, Japan Oxford University, UK Plastiasite S.A., Spain University of Regina, Canada University of Windsor, Canada University of Milan, Italy Università degli Studi di Trento, Italy University of Vigo, Spain Università di Napoli Parthenope, Italy IST-UTL and INESC-ID, Portugal Osaka Prefecture University, Japan University of Minho, Portugal University of New South Wales, Australia Catholic University in Brescia, Italy University of Genoa, Italy Università di Napoli Federico II, Italy Missouri University of Science and Technology, USA University of Tsukuba, Japan Ghent University, Belgium Fraunhofer-Institute IFF Magdeburg, Germany University of Innsbruck, Austria Brno University of Technology, Czech Republic Cardiff University, UK The University of Newcastle, Australia Fukuoka Institute of Technology, Japan National Taiwan University of Science and Technology, Taiwan Rockwell Automation Research Centre, Czech Republic The University of Birmingham, UK Old Dominion University, USA VICOMTech, Spain IIIA-CSIC, Spain University of Piraeus, Greece Sirris, Belgium Università Politecnica delle Marche, Italy
Uchino, Eiji Urlings, Pierre Vadera, Sunil Valdéz Vela, Mercedes Vellido, Alfredo Virvou, Maria Wang, Zidong Watts, Mike White, Richard J. Williams, M. Howard Yang, Zijiang Yoshida, Hiroyuki Zanni-Merk, Cecilia Zheng, Li-Rong
Yamaguchi University, Japan DSTO, Department of Defence, Australia University of Salford, UK Universidad de Murcia, Spain Universitat Politècnica de Catalunya, Spain University of Piraeus, Greece Brunel University, UK TBA, New Zealand Cardiff University, UK Heriot-Watt University, UK York University, Canada Harvard Medical School, USA LGeCo - INSA de Strasbourg, France Royal Institute of Technology (KTH), Sweden
Reviewers Adam Nowak Adam Slowik Adrian S. Barb Akinori Abe Akira Hattori Alan Paton Alessandra Micheletti Alfredo Cuzzocrea Ammar Aljer Amparo Vila Ana Fernandez-Vilas Anastasia Kastania Anastasius Moumtzoglou Andrea Visconti Andreas Abecker Andreas Dengel Andreas Oikonomou Andrew Jones Annalisa Appice Anne Håkansson Ansgar Bernardi Anthony Soroka Antonio Gomez-Skarmeta Antonio Zippo Aristidis Likas Armando Buzzanca Artur Silic Athina Lazakidou Azizul Azhar Ramli Balázs Gaál
Bao Rong Chang Benjamin Adrian Bernard Grabot Bernd Reusch Bettina Waldvogel Björn Forcher Bob Coyne Bojan Basrak Bojana Dalbelo Basic Bozidar Ivankovic Branko Zitko Bruno Apolloni Calin Ciufudean Carlo Sansone Carlos Ocampo Carlos Pedrinaci Carlos Toro Cecilia Zanni-Merk Cesar Sanin Chang-Tien Lu Christian Becker-Asano Christine Mumford Chunbo Chu Costantino Lucisano C.P. Lim Cristos Orovas Daniela Godoy Danijel Radosevic Danilo Dell'Agnello David Martens
David Vallejo Davor Skrlec Dickson Lukose Dilip Pratihar Doctor Jair Abe Don Jeng Donggang Yu Doris Csipkes Eduardo Cerqueira Eduardo Merlo Edward Szczerbicki Eiji Uchino Elena Pagani Elena Simperl Esmail Bonakdarian Esmiralda Moradian Francesco Camastra Frane Saric Fujiki Morii Fumihiko Anma Fumitaka Uchio Gabbar Hossam Gabor Csipkes Gabriel Oltean Gabriella Pasi George Mitchell George Tsihrintzis Gergely Héja Gianluca Sforza Giovanna Castellano
Giovanni Gomez Zuluaga Gunnar Grimnes Gyorgy Surjan Haoxi Dorje Zhang Haruhiko H. Nishimura Haruhiko Haruhiko Nishimura Haruki Kawanaka Hector Alvarez Hernando Fernandez-Canque Hideaki Ito Hidehiko Hayashi Hideo Funaoi Hideyuki Matsumoto Hirokazu Miura Hisayoshi Kunimune Hrvoje Markovic Huey-Ming Lee Ilias Maglogiannis Ing. Angelo Ciccazzo Ivan Koychev Ivan Stajduhar J. Mattila Jair Abe Jari Kortelainen Jayanthi Ranjan Jerome Darmont Jessie Kennedy Jesualdo Tomás Fernández-Breis Jiangtao Cao Jim Sheng Johnson Fader Jose Manuel Molina Juan Botia Juan Manuel Corchado Juan Pavon Julia Hirschberg Jun Munemori Jun Sawamoto Junzo Watada Jure Mijic Katalina Grigorova Katsuhiro Honda Katsumi Yamashita Kazuhiko Tsuda
Kazuhiro Ohkura Kazuhiro Takeda Kazuhisa Seta Kazumi Nakamatsu Kazunori Nishino Kazuo Misue Keiichiro Mitani Kenji Matsuura Koji Harada Kouji Yoshida Lars Hildebrand Laura Caponetti Lei Liu Lelia Festila Leonardo Mancilla Amaya Lily Lin Ljiljana Stojanovic Lorenzo Magnani Lorenzo Valerio Ludger van Elst Manuel Grana Marek Malski Maria Torsello Mario Koeppen Marko Banek Martin Lopez-Nores Martine Cock Masakazu Takahashi Masaru Noda Masato Soga Masayoshi Aritsugi Mayumi Ueda Melita Hajdinjak Michelangelo Ceci Michele Missikoff Miguel Delgado Milko Marinov Minhua Ma Minoru Minoru Fukumi Monica Bianchini Motoi Iwashita Motoki Miura Nahla Barakat Naohiro Ishii Naoto Mukai Naoyuki Naoyuki Kubota
Narayanan Kulathuramaiyer Nikica Hlupi Nikola Ljubesic Nikos Tsourveloudis Nobuhiro Inuzuka Nobuo Suzuki Norbert Jastroch Norio Baba Noriyuki Matsuda Omar Rana Orleo Marinaro Paolo Crippa Pasquale Di Meo Pavel Tichy Philippe Wilde Rafael Batres Raffaele Cannone Ramón Jimenez Rashid Mehmood Richard Pyle Richard White Robert Howlett Roberto Cordone Ronald Hartung Roumen Kountchev Rozália Lakner Ruediger Oehlmann Ruth Cobos Ryohei Nakano Ryuuki Sakamoto Sachio Hirokawa Satoru Fujii Sebastian Rios Sebastian Weber Seiji Isotani Seiki Akama Setsuya Kurahashi Shamshul Bahar Yaakob Shinji Fukui Shusaku Tsumoto Shyue-Liang Wang Simone Bassis Sophia Kossida Stamatia Dasiopoulou Stefan Zinsmeister Stefania Marrara
Stefania Montani Stephan Chalup Steven Schockaert Sunil Vadera susumu hashizume Susumu Kunifuji Takanobu Umetsu Takashi Hamaguchi Takashi Mitsuishi Takashi Yukawa Takaya Yuizono Takeshi Okamoto Taketoshi Kurooka Taketoshi Ushiama Takushi Tanaka Tapio Elomaa Tatiana Tambouratzis Tessai Hayama Thomas Roth-Berghofer Tomislav Hrkac Tomoko Kojiri
Toru Fukumoto Toshihiro Hayashi Toshio Mochizuki Toumoto Toyohide Watanabe Toyohis Nakada Tsuyoshi Nakamura Tuan Pham Valerio Arnaboldi Vassilis Kodogiannis Vassilis Moulianitis Vesa Niskanen Veselka Boeva Vivek Bannore Wataru Sunayama Wei-Chung Teng William Hochstettler Winston Jain Wolfgang Stock Xiaofei Ji Yi Xiao
Yiannis Boutalis Yoshifumi Tsuge Yoshihiro Okada Yoshihiro Takuya Yoshinori Adachi Yoshiyuki Yamashita Youji Ochi Young Ko Yuichiro Tateiwa Yuji Iwahori Yuji Wada Yuji Watanabe Yuki Hayashi Yukio Ohsawa Yumiko Nara Yurie Iribe Zdenek Zdrahal Ziad Salem Zijiang Yang Zlatko Drmac Zuwairie Ibrahim
Table of Contents – Part II
Web Intelligence, Text and Multimedia Mining and Retrieval Semantics-Based Representation Model for Multi-layer Text Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jiali Yun, Liping Jing, Jian Yu, and Houkuan Huang
1
Frequent Itemset Based Hierarchical Document Clustering Using Wikipedia as External Knowledge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . G.V.R. Kiran, Ravi Shankar, and Vikram Pudi
11
Automatic Authorship Attribution for Texts in Croatian Language Using Combinations of Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Tomislav Reicher, Ivan Kriˇsto, Igor Belˇsa, and Artur ˇsili´c
21
Visualization of Text Streams: A Survey . . . . . . . . . . . . . . . . . . . . . . . . . . . . Artur ˇsili´c and Bojana Dalbelo Baˇsi´c Combining Semantic and Content Based Image Retrieval in ORDBMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Carlos E. Alvez and Aldo R. Vecchietti A Historically-Based Task Composition Mechanism to Support Spontaneous Interactions among Users in Urban Computing Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Angel Jimenez-Molina and In-Young Ko Multi-criteria Retrieval in Cultural Heritage Recommendation Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Pierpaolo Di Bitonto, Maria Laterza, Teresa Roselli, and Veronica Rossano An Approach for the Automatic Recommendation of Ontologies Using Collaborative Knowledge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Marcos Mart´ınez-Romero, Jos´e M. V´ azquez-Naya, Cristian R. Munteanu, Javier Pereira, and Alejandro Pazos Knowledge Mining with ELM System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ilona Bluemke and Agnieszka Orlewicz DOCODE-Lite: A Meta-Search Engine for Document Similarity Retrieval . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Felipe Bravo-Marquez, Gaston L’Huillier, Sebasti´ an A. R´ıos, Juan D. Vel´ asquez, and Luis A. Guerrero
31
44
54
64
74
82
93
Intelligent Tutoring Systems and E-Learning Environments Group Formation for Collaboration in Exploratory Learning Using Group Technology Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mihaela Cocea and George D. Magoulas
103
Applying Pedagogical Analyses to Create an On-Line Course for e Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D.-L. Le, V.-H. Tran, D.-T. Nguyen, A.-T. Nguyen, and A. Hunger
114
Adaptive Modelling of Users’ Strategies in Exploratory Learning Using Case-Based Reasoning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mihaela Cocea, Sergio Gutierrez-Santos, and George D. Magoulas
124
An Implementation of Reprogramming Scheme for Wireless Sensor Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Aoi Hashizume, Hiroshi Mineno, and Tadanori Mizuno
135
Predicting e-Learning Course Adaptability and Changes in Learning Preferences after Taking e-Learning Courses . . . . . . . . . . . . . . . . . . . . . . . . . Kazunori Nishino, Toshifumi Shimoda, Yurie Iribe, Shinji Mizuno, Kumiko Aoki, and Yoshimi Fukumura
143
Intelligent Systems A Logic for Incomplete Sequential Information . . . . . . . . . . . . . . . . . . . . . . . Norihiro Kamide A Power-Enhanced Algorithm for Spatial Anomaly Detection in Binary Labelled Point Data Using the Spatial Scan Statistic . . . . . . . . . . . . . . . . . Simon Read, Peter Bath, Peter Willett, and Ravi Maheswaran Vertical Fragmentation Design of Distributed Databases Considering the Nonlinear Nature of Roundtrip Response Time . . . . . . . . . . . . . . . . . . . Rodolfo A. Pazos R., Graciela V´ azquez A., Jos´e A. Mart´ınez F., and Joaqu´ın P´erez O. Improving Iterated Local Search Solution For The Linear Ordering Problem With Cumulative Costs (LOPCC) . . . . . . . . . . . . . . . . . . . . . . . . . David Ter´ an Villanueva, H´ector Joaqu´ın Fraire Huacuja, Abraham Duarte, Rodolfo Pazos R., Juan Mart´ın Carpio Valadez, and H´ector Jos´e Puga Soberanes A Common-Sense Planning Strategy for Ambient Intelligence . . . . . . . . . . Mar´ıa J. Santofimia, Scott E. Fahlman, Francisco Moya, and Juan C. L´ opez
153
163
173
183
193
Dialogue Manager for a NLIDB for Solving the Semantic Ellipsis Problem in Query Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Rodolfo A. Pazos R., Juan C. Rojas P., Ren´e Santaolaya S., Jos´e A. Mart´ınez F., and Juan J. Gonzalez B. Hand Gesture Recognition Based on Segmented Singular Value Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jing Liu and Manolya Kavakli Reasoning and Inference Rules in Basic Linear Temporal Logic BLTL . . . S. Babenyshev and V. Rybakov Direct Adaptive Control of an Anaerobic Depollution Bioprocess Using Radial Basis Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Emil Petre, Dorin S ¸ endrescu, and Dan Seli¸steanu Visualisation of Test Coverage for Conformance Tests of Low Level Communication Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Katharina Tschumitschew, Frank Klawonn, Nils Oberm¨ oller, and Wolfhard Lawrenz Control Network Programming with SPIDER: Dynamic Search Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kostadin Kratchanov, Tzanko Golemanov, Emilia Golemanova, and Tuncay Ercan Non-procedural Implementation of Local Heuristic Search in Control Network Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kostadin Kratchanov, Emilia Golemanova, Tzanko Golemanov, and Tuncay Ercan
203
214
224
234
244
253
263
Meta Agents, Ontologies and Search, a Proposed Synthesis . . . . . . . . . . . . Ronald L. Hartung and Anne H˚ akansson
273
Categorizing User Interests in Recommender Systems . . . . . . . . . . . . . . . . . Sourav Saha, Sandipan Majumder, Sanjog Ray, and Ambuj Mahanti
282
Architecture of Hascheck – An Intelligent Spellchecker for Croatian Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ˇ Sandor Dembitz, Gordan Gledec, and Bruno Blaˇskovi´c Light-Weight Access Control Scheme for XML Data . . . . . . . . . . . . . . . . . . Dongchan An, Hakin Kim, and Seog Park A New Distributed Particle Swarm Optimization Algorithm for Constraint Reasoning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Sadok Bouamama
292
302
312
Simulation of Fuzzy Control Applied to a Railway Pantograph-Catenary System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simon Walters
322
Floor Circulation Index and Optimal Positioning of Elevator Hoistways . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Panagiotis Markos and Argyris Dentsoras
331
Rapid Evaluation of Reconfigurable Robots Anatomies Using Computational Intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Harry Valsamos, Vassilis Moulianitis, and Nikos Aspragathos
341
Incremental Construction of Alpha Lattices and Association Rules . . . . . Henry Soldano, V´eronique Ventos, Marc Champesme, and David Forge Intelligent Magnetic Sensing System for Low Power WSN Localization Immersed in Liquid-Filled Industrial Containers . . . . . . . . . . . . . . . . . . . . . Kuncup Iswandy, Stefano Carrella, and Andreas K¨ onig
351
361
Intelligent Data Processing in Process Systems and Plants An Overview of a Microcontroller-Based Approach to Intelligent Machine Tool Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Raees Siddiqui, Roger Grosvenor, and Paul Prickett Use of Two-Layer Cause-Effect Model to Select Source of Signal in Plant Alarm System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kazuhiro Takeda, Takashi Hamaguchi, Masaru Noda, Naoki Kimura, and Toshiaki Itoh
371
381
Coloured Petri Net Diagnosers for Lumped Process Systems . . . . . . . . . . . Attila T´ oth, Erzs´ebet N´emeth, and Katalin M. Hangos
389
Proactive Control of Manufacturing Processes Using Historical Data . . . . Manfred Grauer, Sachin Karadgi, Ulf M¨ uller, Daniel Metz, and Walter Sch¨ afer
399
A Multiagent Approach for Sustainable Design of Heat Exchanger Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Naoki Kimura, Kizuki Yasue, Tekishi Kou, and Yoshifumi Tsuge Consistency Checking Method of Inventory Control for Countermeasures Planning System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Takashi Hamaguchi, Kazuhiro Takeda, Hideyuki Matsumoto, and Yoshihiro Hashimoto
409
417
Fault Semantic Networks for Accident Forecasting of LNG Plants . . . . . . Hossam A. Gabbar
427
A Meta Heuristic Approach to Management Engineering Fuzzy Group Evaluating the Aggregative Risk Rate of Software Development . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Huey-Ming Lee and Lily Lin
438
Fuzzy Power System Reliability Model Based on Value-at-Risk . . . . . . . . Bo Wang, You Li, and Junzo Watada
445
Human Tracking: A State-of-Art Survey . . . . . . . . . . . . . . . . . . . . . . . . . . . . Junzo Watada, Zalili Musa, Lakhmi C. Jain, and John Fulcher
454
Ordinal Structure Fuzzy Logic Predictor for Consumer Behaviour . . . . . . Rubiyah Yusof, Marzuki Khalid, and Mohd. Ridzuan Yunus
464
Kansei for Colors Depending on Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . Taki Kanda
477
A Hybrid Intelligent Algorithm for Solving the Bilevel Programming Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shamshul Bahar Yaakob and Junzo Watada
485
Knowledge Engineering and Smart Systems Using Semantics to Bridge the Information and Knowledge Sharing Gaps in Virtual Engineering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Javier Vaquero, Carlos Toro, Carlos Palenzuela, and Eneko Azpeitia
495
Discovering and Usage of Customer Knowledge in QoS Mechanism for B2C Web Server Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Leszek Borzemski and Gra˙zyna Suchacka
505
Conceptual Fuzzy Model of the Polish Internet Mortgage Market . . . . . . Aleksander Orlowski and Edward Szczerbicki
515
Translations of Service Level Agreement in Systems Based on Service Oriented Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Adam Grzech and Piotr Rygielski
523
Ontology Engineering Aspects in the Intelligent Systems Development . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Adam Czarnecki and Cezary Orlowski
533
Supporting Software Project Management Processes Using the Agent System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Cezary Orlowski and Artur Zi´ olkowski
543
Knowledge-Based Virtual Organizations for the E-Decisional Community . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Leonardo Mancilla-Amaya, Cesar San´ın, and Edward Szczerbicki
553
Decisional DNA Applied to Robotics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Haoxi Zhang, Cesar Sanin, and Edward Szczerbicki Supporting Management Decisions with Intelligent Mechanisms of Obtaining and Processing Knowledge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Cezary Orlowski and Tomasz Sitek Finding Inner Copy Communities Using Social Network Analysis . . . . . . . ´ Eduardo Merlo, Sebasti´ an A. R´ıos, H´ector Alvarez, Gaston L’Huillier, and Juan D. Vel´ asquez
563
571 581
Enhancing Social Network Analysis with a Concept-Based Text Mining Approach to Discover Key Members on a Virtual Community of Practice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . H´ector Alvarez, Sebasti´ an A. R´ıos, Felipe Aguilera, Eduardo Merlo, and Luis A. Guerrero
591
Intelligence Infrastructure: Architecture Discussion: Performance, Availability and Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Giovanni G´ omez Zuluaga, Cesar San´ın, and Edward Szczerbicki
601
Skill Acquisition and Ubiquitous Human Computer Interaction Geometric Considerations of Search Behavior . . . . . . . . . . . . . . . . . . . . . . . . Masaya Ashida and Hirokazu Taki A Web-Community Supporting Self-management for Runners with Annotation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Naka Gotoda, Kenji Matsuura, Shinji Otsuka, Toshio Tanaka, and Yoneo Yano An Analysis of Background-Color Effects on the Scores of a Computer-Based English Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Atsuko K. Yamazaki Message Ferry Route Design Based on Clustering for Sparse Ad Hoc Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hirokazu Miura, Daisuke Nishi, Noriyuki Matsuda, and Hirokazu Taki
611
620
630
637
Affordance in Dynamic Objects Based on Face Recognition . . . . . . . . . . . . Taizo Miyachi, Toshiki Maezawa, Takanao Nishihara, and Takeshi Suzuki
645
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
653
Semantics-Based Representation Model for Multi-layer Text Classification

Jiali Yun, Liping Jing, Jian Yu, and Houkuan Huang

School of Computer and Information Technology, Beijing Jiaotong University, China
{yunjialiyjl,lpjinghk,yujian6463,hkhuang}@gmail.com
Abstract. Text categorization is one of the most common themes in the data mining and machine learning fields. Unlike structured data, unstructured text data is more complicated to analyze because it carries both syntactic and semantic information. In this paper, we propose a semantics-based model that represents text data on two levels. One level holds syntactic information and the other holds semantic information. The syntactic level represents each document as a term vector whose components record the tf-idf value of each term. The semantic level represents the document with the Wikipedia concepts related to the terms on the syntactic level. The syntactic and semantic information are efficiently combined by our proposed multi-layer classification framework. Experimental results on a benchmark dataset (Reuters-21578) show that the proposed representation model together with the proposed classification framework improves the performance of text classification compared with flat text representation models (term VSM, concept VSM, term+concept VSM) combined with existing classification methods.

Keywords: Representation Model, Semantics, Wikipedia, Text Classification, Multi-layer Classification.
1 Introduction
Text categorization is one of the most common themes in the data mining and machine learning fields. The task of text categorization is to build a classifier based on some labeled documents and to classify unlabeled documents into prespecified categories. As we know, text data is unstructured and thus cannot be directly processed by existing classification algorithms. To structure text data, the Bag of Words (BOW) model [1] is widely used. In BOW (i.e., term VSM), a document is represented as a feature vector consisting of the words appearing in the document. However, BOW has two main drawbacks: (1) it does not consider the semantic relatedness between words, i.e., two words with similar meanings are treated as unrelated features; (2) one word used in different contexts with different meanings cannot be differentiated. In other words, the BOW model does not cover semantic information, which limits its performance in many text mining tasks, e.g., classification, clustering and information retrieval.
To build a semantic representation of a document, concepts related to the document are extracted from background knowledge. The major approach to concept extraction is to map the terms or phrases of documents to their corresponding concepts in WordNet or Wikipedia [2, 3, 6, 7, 8, 9]. Based on these extracted concepts, a semantics-based text representation model can be built. A basic idea is to adopt only concepts as the features of the document vector, called the BOC model (i.e., concept VSM) [4]. Another is to represent a document as a combined vector obtained by extending the term vector with concepts (i.e., term+concept VSM) [3, 6]. These researchers experimentally showed that the combined vector usually improves the performance of text clustering, while the BOC model does not perform better than BOW on most real data. These observations give us a hint: a concept vector can supply additional information for discriminating documents, but concepts alone cannot represent a document sufficiently, because concept mapping can lose information or add noise. It is therefore necessary to include both terms and concepts in a representation model for document analysis. Another common method for considering both syntactic and semantic information in document classification and clustering is to apply a linear combination of term-based similarity and concept-based similarity as the final similarity measure [7, 8, 9], but this requires optimizing the combination parameters to achieve further improvement. Additionally, these methods use a flat feature representation model which cannot consider the spatial distribution of terms and concepts [10]. In this paper, we propose a semantics-based representation model covering both syntactic and semantic information on two levels. The semantic information is represented with the appropriate concepts related to the terms in the document, with the aid of Wikipedia. For each term of a document, its relevant concepts are extracted from Wikipedia by measuring the semantic similarity between these concepts and the relevant concepts of the other terms in the document (i.e., the term's context). Different from existing methods, the proposed semantics-based model represents a document as a two-level vector containing syntactic (terms) and semantic (relevant concepts) information respectively. The syntactic level and the semantic level are connected by the semantic relatedness between terms and concepts. Furthermore, in order to make efficient use of the semantic information, we design a multi-layer classification (ML-CLA) framework which analyzes text data layer by layer. The ML-CLA framework includes multiple classifiers and is implemented with two schemes. In the first scheme, there are as many classifiers as levels in the proposed representation model, e.g., two classifiers for the two-level model. The low-layer classifier takes the data represented on the low level as input, and its output is used to represent each document as a compressed vector. These compressed vectors are input into the high-layer classifier together with the data represented on the high level. The second scheme consists of three classifiers. Two of them are applied to the documents represented on the two levels respectively and independently. Based on the output of these two classifiers, each document is represented as two compressed vectors. By combining these two compressed vectors, the third classifier is applied to obtain
the final result. The ML-CLA framework effectively keeps the primary information and reduces the influence of noise by compressing the original information, so that the framework guarantees the quality of the input of each classifier. We therefore expect the final classification performance to improve. The rest of the paper is organized as follows: Section 2 introduces the proposed semantics-based representation model. In Section 3, we present the ML-CLA framework. Section 4 describes the experiments and discusses the results. Finally, we conclude the paper in Section 5.
2 Document Representation
A new semantics-based representation model is proposed here to represent a document as two levels of feature vectors containing syntactic information and semantic information respectively. The syntactic level is based on the terms appearing in the document; term-based VSM and the tf-idf weighting scheme are used on this level. The semantic level consists of the Wikipedia concepts related to the terms of the document. These two levels are connected via the semantic correlation between terms and relevant concepts. We note that the mapping between terms and concepts may be many-to-many and that some terms may have no relevant concept. The key part of this model is constructing the semantic level. Our method includes two steps: correctly mapping terms to concepts and effectively calculating the weight of each concept. Some methods have been proposed for this purpose [3, 9, 8]. In this paper, we implement the concept mapping and weighting based on Wikipedia's abundant link structure. To date, Wikipedia is the largest encyclopedia in the world and is useful for natural language processing [5]. Each article in Wikipedia describes a topic, denoted as a concept. Wikipedia's documentation indicates that if a term or phrase of an article relates to a significant topic, it should be linked to the corresponding article; such a term or phrase is called an anchor. We identify the candidate relevant concepts for a term via these hyperlinks, as mentioned in [11]. In Wikipedia, there may be twenty concepts related to one term [9]. Which one or ones are truly semantically related to the term in a given document? Since a term is usually used in one of its several most obvious senses, we first select the top k (e.g., k = 3) obvious senses as candidate concepts for each term, where the obviousness of sense c for term t is defined as the ratio of the frequency with which t appears as an anchor linked to c to the frequency with which t appears as an anchor in all Wikipedia articles. Next, we adopt a context-based method [11] to calculate the semantic relatedness between a term and its candidate concepts in a given document. The relatedness is defined by Eq. (1):

Rel(t, c_i | d_j) = \frac{1}{|T| - 1} \sum_{t_l \in T,\, t_l \neq t} \frac{1}{|cs_l|} \sum_{c_k \in cs_l,\, t_l \sim c_k} SIM(c_i, c_k)    (1)
where T is the term set of the jth document d_j, t_l is a term in d_j other than t, t_l ∼ c_k means that term t_l is related to concept c_k, and cs_l is the set of obvious Wikipedia concepts related to term t_l. SIM(c_i, c_k) is the semantic relatedness between two concepts, which is calculated using the hyperlink structure of Wikipedia [12] as follows:

SIM(c_i, c_k) = 1 - \frac{\log(\max(|A|, |B|)) - \log(|A \cap B|)}{\log(|W|) - \log(\min(|A|, |B|))}    (2)

where A and B are the sets of all articles that link to concepts c_i and c_k respectively, and W is the set of all articles in Wikipedia. Eq. (2) is based on occurrences on Wikipedia pages: pages that link to both concepts indicate relatedness, while pages that link to only one of them suggest the opposite. A higher value of Rel(t, c_i | d_j) shows that concept c_i is more semantically related to term t, because c_i is more similar to the relevant concepts of the other terms in d_j (the other terms are the context of term t in d_j). The concepts with the highest relatedness are used to build the concept vector (semantic level) of the semantics-based representation model. Meanwhile, we define a concept's weight as the weighted sum of the related terms' weights w(t_k, d_j), based on the semantic relatedness between the concept and its related terms:

w(c_i, d_j) = \sum_{t_k \in d_j} w(t_k, d_j) \cdot Rel(t_k, c_i | d_j)    (3)
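To make Eqs. (1)-(3) concrete, the following Python sketch shows how the candidate senses, the concept relatedness and the concept weights could be computed. The data structures (anchor_stats, inlinks, candidates, term_weights, rel) are assumptions standing in for statistics precomputed from a Wikipedia dump and from the tf-idf step; this is not the authors' implementation.

```python
import math
from collections import Counter

def top_k_obvious_senses(term, anchor_stats, k=3):
    """Candidate concepts cs for a term: the k senses with the highest 'obviousness',
    i.e. count(term as anchor linking to c) / count(term as anchor at all)."""
    links = anchor_stats.get(term, Counter())
    return [c for c, _ in links.most_common(k)] if links else []

def sim(c_i, c_k, inlinks, n_articles):
    """Link-based concept relatedness of Eq. (2); inlinks[c] is the set of articles
    linking to concept c, n_articles is |W|."""
    A, B = inlinks[c_i], inlinks[c_k]
    overlap = len(A & B)
    if overlap == 0:
        return 0.0  # guard against log(0); disjoint link sets mean no relatedness
    num = math.log(max(len(A), len(B))) - math.log(overlap)
    den = math.log(n_articles) - math.log(min(len(A), len(B)))
    return 1.0 - num / den

def relatedness(t, c_i, terms, candidates, inlinks, n_articles):
    """Context-based relatedness Rel(t, c_i | d_j) of Eq. (1): average similarity of
    c_i to the candidate concepts of the other terms of the document."""
    total = 0.0
    for t_l in terms:
        cs_l = candidates.get(t_l, [])
        if t_l == t or not cs_l:
            continue
        total += sum(sim(c_i, c_k, inlinks, n_articles) for c_k in cs_l) / len(cs_l)
    return total / (len(terms) - 1) if len(terms) > 1 else 0.0

def concept_weight(c_i, terms, term_weights, rel):
    """Concept weight w(c_i, d_j) of Eq. (3); rel[(t, c_i)] holds Rel(t, c_i | d_j)."""
    return sum(term_weights[t] * rel.get((t, c_i), 0.0) for t in terms)
```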
Table 1 shows a simple example of how to build the two-level representation model based on Eq. (3). The term column is extracted from a given document and the terms' weights are listed in column 3. The concept column is obtained by mapping the terms into Wikipedia, and the relatedness between concepts and terms is listed in column 4. Column 5 indicates the concepts' weights calculated by Eq. (3).

Table 1. A simple example of the two-level representation model

term (t)     concept (c)   w(t, d)   R(t, c|d)   w(c, d)
cash         Money         0.0404    0.2811      1.14e-2
earn         Income        0.0401    0.2641      1.06e-2
sell         Trade         0.0794    0.2908      4.61e-2
sale         Trade         0.0791    0.2908      4.61e-2
share        Stock         0.0201    0.2388      4.80e-3
agreement    Contract      0.0355    0.2617      9.29e-3
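As a quick sanity check on Table 1 (a sketch using only the rounded values printed in the table), the w(c, d) column follows from Eq. (3) by summing w(t, d) * R(t, c|d) over the terms mapped to each concept; note that "sell" and "sale" both contribute to Trade.

```python
# Recompute the w(c, d) column of Table 1 from columns 3 and 4 via Eq. (3).
rows = [("cash", "Money", 0.0404, 0.2811), ("earn", "Income", 0.0401, 0.2641),
        ("sell", "Trade", 0.0794, 0.2908), ("sale", "Trade", 0.0791, 0.2908),
        ("share", "Stock", 0.0201, 0.2388), ("agreement", "Contract", 0.0355, 0.2617)]
w_concept = {}
for term, concept, w_t, rel in rows:
    w_concept[concept] = w_concept.get(concept, 0.0) + w_t * rel
print(w_concept)  # Money~1.14e-2, Income~1.06e-2, Trade~4.61e-2, Stock~4.80e-3, Contract~9.29e-3
```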
So far, a document is represented with a two-level model which efficiently integrates syntactic and semantic information. These two representation levels can be taken as a low level and a high level respectively: e.g., if the syntactic level is the low level, the semantic level will be the high level, and vice versa. Next, we design a multi-layer classification framework to efficiently classify documents represented by the proposed model.
3 Multi-Layer Classification Framework
The multi-layer classification (ML-CLA) framework is specially designed to handle large-scale, high-dimensional data layer by layer. The framework contains multiple classifiers which run on different feature spaces. This idea is also adopted in co-training [13], which assumes that the feature spaces are independent of each other. However, this assumption rarely holds, because the feature spaces are more or less related in real applications. Our framework accounts for such relationships by processing the levels layer by layer. The ML-CLA framework offers two schemes for classifying data represented with the semantics-based model proposed in Section 2. The first scheme consists of two classifiers in two layers, one for the low layer and the other for the high layer. The low-layer classifier is applied to the data represented by the low level of the model, e.g., the syntactic level (term information) or the semantic level (concept information). Its output can be treated as a compressed representation of the low-level data. The input of the high-layer classifier contains this compressed representation together with the data represented by the high level of the model. Throughout the classification procedure, the feature space of the high-layer classifier is thus complemented with the feature space of the low level. For our representation model, if the syntactic level (term-based VSM) is taken as the low level, the semantic level (concept-based VSM) is the high level, and vice versa. We name the former strategy concept+K and the latter term+K. Fig. 1 illustrates the concept+K scheme. In Fig. 1, the bottom layer represents the low-layer classifier and the top layer represents the high-layer classifier. The low-layer classifier is first trained on the training set; it then classifies the test set and assigns labels to those documents. Here, both the training set and the test set are represented by the term-based VSM:

d_j^{low-level} = [t_{j1}, \dots, t_{jN}]

Once the low-layer classifier is built, based on the true labels of the training set and the predicted labels of the test set, the center of each class can be determined by averaging the document vectors belonging to that class, as shown in Eq. (4):
Z_k = \frac{1}{|Z_k|} \sum_{d_j \in C_k} d_j    (4)
where |Z_k| is the number of documents in the kth class C_k. Based on the class centers, we can represent each document with a K-dimensional compressed vector by calculating its similarities to the class centers (K equals the number of classes). In our experiments, the cosine measure is used to compute the similarities between documents and centers:

s_{jk} = \cos(d_j, Z_k)    (5)
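A minimal sketch of Eqs. (4) and (5) in Python/NumPy, assuming the documents are rows of a dense matrix X and labels is an integer array (names are illustrative, not from the paper):

```python
import numpy as np

def class_centers(X, labels, n_classes):
    """Z_k of Eq. (4): the average of the document vectors assigned to class k."""
    return np.vstack([X[labels == k].mean(axis=0) for k in range(n_classes)])

def compress(X, centers):
    """s_jk of Eq. (5): cosine similarity between each document and each class center,
    giving a K-dimensional compressed representation per document."""
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    Cn = centers / (np.linalg.norm(centers, axis=1, keepdims=True) + 1e-12)
    return Xn @ Cn.T  # shape (n_documents, K)
```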
As shown in the middle layer of Fig. 1, this compressed representation is input into the high-layer classifier together with the feature space of the high level (concept-based VSM) as follows:
d_j^{high-level} = [c_{j1}, \dots, c_{jM}, s_{j1}, \dots, s_{jK}]

where c_{ji} is the weight of the ith feature of document d_j represented on the high level of the proposed model, M is the number of features on the high level, and s_{jk} is the similarity between the document and the kth class center, K being the number of classes. A document is thus represented as a combined vector (concept+K VSM). Based on this representation, the high-layer classifier is trained on the training set and then classifies the test set; its output labels are the final classification results, as shown in the top layer of Fig. 1.
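Putting the pieces together, a rough sketch of the concept+K scheme is shown below, reusing class_centers and compress from the earlier sketch, with scikit-learn's LinearSVC standing in for the SVM base classifier; the matrix and variable names are assumptions, not the authors' code.

```python
import numpy as np
from sklearn.svm import LinearSVC  # stand-in for the SVM base classifier

def concept_plus_k(X_term_tr, X_conc_tr, y_tr, X_term_te, X_conc_te, n_classes):
    # Low layer: classifier on the term-based VSM, used to label the test documents.
    low = LinearSVC().fit(X_term_tr, y_tr)
    y_te_pred = low.predict(X_term_te)
    # Class centers from true training labels plus predicted test labels (Eq. 4).
    centers = class_centers(np.vstack([X_term_tr, X_term_te]),
                            np.concatenate([y_tr, y_te_pred]), n_classes)
    S_tr, S_te = compress(X_term_tr, centers), compress(X_term_te, centers)
    # High layer: classifier on concept features plus the K similarity features.
    high = LinearSVC().fit(np.hstack([X_conc_tr, S_tr]), y_tr)
    return high.predict(np.hstack([X_conc_te, S_te]))
```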
Fig. 1. ML-CLA Framework (Concept+K Schema)
The other ML-CLA scheme, shown in Fig. 2, consists of three classifiers. The first two classifiers are applied to the syntactic level and the semantic level independently. Based on these two classifiers, each document is represented by two K-dimensional compressed vectors built from its similarities to the corresponding class centers. The third classifier is built on the combination of these two compressed vectors:

d_j = [s_{j1}, \dots, s_{jK}, s'_{j1}, \dots, s'_{jK}]

where s_{jk} is the similarity between the jth document represented on the syntactic level and the kth class center obtained by the first classifier, and s'_{jk} is the similarity between the jth document represented on the semantic level and the kth class center obtained by the second classifier. We denote this scheme K+K. In the ML-CLA framework, the primary information is effectively kept and the noise is reduced by compressing the original information, so that the framework guarantees the quality of the input of each classifier. We therefore expect the final classification performance to improve.
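The K+K scheme can be sketched in the same style: two independent level-specific classifiers each produce K similarity features, and a third classifier is trained on the 2K-dimensional combination. Again this is an assumption-laden sketch (it reuses class_centers and compress from above), not the authors' implementation.

```python
import numpy as np
from sklearn.svm import LinearSVC

def k_plus_k(X_term_tr, X_conc_tr, y_tr, X_term_te, X_conc_te, n_classes):
    blocks = []
    for X_tr, X_te in ((X_term_tr, X_term_te), (X_conc_tr, X_conc_te)):
        clf = LinearSVC().fit(X_tr, y_tr)              # classifier for this level
        y_te_pred = clf.predict(X_te)
        centers = class_centers(np.vstack([X_tr, X_te]),
                                np.concatenate([y_tr, y_te_pred]), n_classes)
        blocks.append((compress(X_tr, centers), compress(X_te, centers)))
    Z_tr = np.hstack([b[0] for b in blocks])           # [s_j1..s_jK, s'_j1..s'_jK]
    Z_te = np.hstack([b[1] for b in blocks])
    return LinearSVC().fit(Z_tr, y_tr).predict(Z_te)   # third, combining classifier
```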
4 Experiments

4.1 Datasets
The proposed representation model and multi-layer classification framework were tested on real data, Reuters-21578. Two data subsets were created from
Fig. 2. ML-CLA Framework (K+K Schema)
Reuters-21578 following [8]: R-Min20Max200 and R-Top10. R-Min20Max200 consists of the 25 categories containing at least 20 and at most 200 documents, 1413 documents in total. In R-Top10, the 10 largest categories are extracted from the original dataset, comprising 8023 documents. In this paper, we only consider single-label documents. Wikipedia is used as background knowledge; the English version contains 2,388,612 articles (i.e., concepts) and 8,339,823 anchors.
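For illustration, one possible way to derive the two subsets from a parsed Reuters-21578 collection is sketched below; docs is an assumed list of (text, labels) pairs, and the exact preprocessing of [8] may differ.

```python
from collections import Counter

def build_subsets(docs):
    # Keep only single-label documents, as in the paper.
    single = [(text, labels[0]) for text, labels in docs if len(labels) == 1]
    sizes = Counter(label for _, label in single)
    # R-Min20Max200: categories with at least 20 and at most 200 documents.
    r_min20max200 = [(t, c) for t, c in single if 20 <= sizes[c] <= 200]
    # R-Top10: the 10 largest categories.
    top10 = {c for c, _ in sizes.most_common(10)}
    r_top10 = [(t, c) for t, c in single if c in top10]
    return r_min20max200, r_top10
```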
4.2 Experimental Results
In order to show the efficiency of the proposed classification methods, we use three flat document representation models as baselines: term VSM, concept VSM and term+concept VSM. Term VSM represents a document as a term vector, i.e., the BOW model. Concept VSM represents a document as a Wikipedia concept vector, also called the BOC model in the literature. In term+concept VSM, a document is represented by a flat combined vector obtained by appending the concepts to the term vector. With these baseline models, a single classifier is enough to classify the text data. The proposed multi-layer classification framework can analyze text data represented with the proposed semantics-based model in three ways, concept+K, term+K and K+K for short. In concept+K, the syntactic level (i.e., term information) is the input of the low-layer classifier. The input of the high-layer classifier contains two parts: the concept information of the semantic level, and the K-dimensional compressed representation of the syntactic level (K is the number of classes in the dataset) obtained by the low-layer classifier. Conversely, term+K takes the semantic level as the input of the low-layer classifier, while the syntactic level and the K-dimensional compressed representation are the input of the high-layer classifier. For the K+K scheme, we first apply two classifiers to the syntactic level and the semantic level respectively. Each classifier is used to construct a K-dimensional compressed representation of the original information. The third classifier is then applied to the data combining these two K-dimensional compressed representations.
Fig. 3. Comparison of classification results (F-measure of n-fold cross validation on testing data) using six models (term VSM, concept VSM, term+concept VSM, ML-CLA (concept+K), ML-CLA (term+K), ML-CLA (K+K)) on two datasets based on SVM and 1NN respectively
Furthermore, we constructed a series of experiments with different percentages of training and test data for each dataset (3/4, 2/3, 1/2, 1/3, 1/4, 1/6, 1/8, 1/10, 1/12, 1/14 and 1/16 are the percentages of training data). When the percentage of training data is 3/4, we use 3/4 of the whole data as the training set, while only 1/4 is used when the percentage is 1/4, and so on. These experiments show how the size of the training data affects the performance of text classification. SVM [14] and KNN [15] were used as basic classifiers, where K is set to 1 in KNN. The SVM and 1NN algorithms in Weka1 were called with default parameter values. In our experiments, each framework only uses one kind of classifier. For example, concept+K uses two SVM classifiers or two 1NN classifiers, while K+K contains three SVM or three 1NN classifiers. Meanwhile, we run the same classifier (SVM or 1NN) on the three baseline representation models. The classification performance is measured with the F-measure [16] of n-fold cross validation on the testing dataset, and a higher F-measure
http://www.cs.waikato.ac.nz/ml/weka/
value means a better classification result. For different sizes of training data, n is set to different values (4, 3, 2, 3, 4, 6, 8, 10, 12, 14 and 16, respectively). Fig. 3 shows the F-measure curves of the SVM and 1NN classifiers with different representation models and different sizes of training data. From Fig. 3, we can see that the K+K classification framework yields the best results in most cases. Both the concept+K and term+K frameworks achieve better results than the three baseline models. Meanwhile, the experimental results also demonstrate that only using concepts to represent a document (concept VSM) usually performs worse than the traditional BOW model (term VSM), because it is difficult to avoid introducing noise and losing information during concept mapping due to the limitations of word sense disambiguation techniques. For term+concept VSM, which is a popular document representation model with semantic information, its performance improves slightly over the other two baseline models in some cases, but sometimes it is also much worse than term VSM. In addition, the results shown in Fig. 3 give us an interesting hint: the proposed multi-layer classification framework combined with the proposed representation model yields a more pronounced improvement when the training set is small. This observation is important because labeled data (i.e., training data) are usually scarce in real applications. Thus, our proposed method is more practical.
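The following sketch illustrates the kind of varying-training-size experiment described above. It is not the authors' Weka-based setup: it uses scikit-learn, a single stratified hold-out split per fraction instead of the paper's n-fold cross validation, and a macro-averaged F-measure.

```python
# Train an SVM and a 1-NN classifier for several training fractions and
# report the F-measure on the held-out portion.
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

def evaluate(X, y, fractions=(3/4, 2/3, 1/2, 1/3, 1/4)):
    results = {}
    for frac in fractions:
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, train_size=frac, stratify=y, random_state=0)
        for name, clf in [("SVM", SVC()),
                          ("1NN", KNeighborsClassifier(n_neighbors=1))]:
            clf.fit(X_tr, y_tr)
            results[(frac, name)] = f1_score(y_te, clf.predict(X_te), average="macro")
    return results
```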
5
Conclusion
In this paper, we present a semantics-based model for text representation with the aid of Wikipedia. The proposed representation model contains two levels, one for term information and the other for concept information, and these levels are connected by the semantic relatedness between terms and concepts. A context-based method is adopted to identify the relatedness between terms and concepts by utilizing the link structure among Wikipedia articles, which is also used to select the most appropriate concept for a term in a given document and perform word sense disambiguation. Based on the proposed representation model, we present a multi-layer classification (ML-CLA) framework to analyze text data in three ways (concept+K, term+K and K+K). Experimental results on a real dataset (Reuters-21578) show that the proposed model and classification framework significantly improve classification performance compared with flat vector models combined with traditional classification algorithms. Acknowledgments. We thank David Milne for providing us with the Wikipedia Miner tool-kit. This work was supported in part by the National Natural Science Foundation of China (60905028, 90820013, 60875031), 973 project (2007CB311002), program for New Century Excellent Talents in University in 2006, Grant NCET-06-0078 and the Fundamental Research Funds for the Central Universities (2009YJS023).
References
1. Yates, R., Neto, B.: Modern information retrieval. Addison-Wesley Longman, Amsterdam (1999)
2. Jing, L., Zhou, L., Ng, M., Huang, J.: Ontology-based distance measure for text clustering. In: 4th Workshop on Text Mining, the 6th SDM, Bethesda, Maryland (2006)
3. Hotho, A., Staab, S., Stumme, G.: WordNet improves text document clustering. In: The Semantic Web Workshop at the 26th ACM SIGIR, Toronto, Canada, pp. 541–544 (2003)
4. Gabrilovich, E., Markovitch, S.: Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In: The 20th IJCAI, Hyderabad, India, pp. 1606–1611 (2007)
5. Gabrilovich, E., Markovitch, S.: Wikipedia-based semantic interpretation for natural language processing. J. of Artificial Intelligence Research 34, 443–498 (2009)
6. Wang, P., Domeniconi, C.: Building semantic kernels for text classification using Wikipedia. In: The 14th ACM SIGKDD, New York, pp. 713–721 (2008)
7. Hu, J., Fang, L., Cao, Y., Zeng, H., Li, H., Yang, Q., Chen, Z.: Enhancing text clustering by leveraging Wikipedia semantics. In: The 31st ACM SIGIR, Singapore, pp. 179–186 (2008)
8. Huang, A., Milne, D., Frank, E., Witten, I.: Clustering documents using a Wikipedia-based concept representation. In: The 13th PAKDD, Bangkok, Thailand, pp. 628–636 (2009)
9. Hu, X., Zhang, X., Lu, C., Park, E.K., Zhou, X.: Exploiting Wikipedia as external knowledge for document clustering. In: The 15th ACM SIGKDD, Paris, pp. 389–396 (2009)
10. Chow, T., Rahman, M.: Multilayer SOM with tree-structured data for efficient document retrieval and plagiarism detection. IEEE Trans. on Neural Networks 20, 1385–1402 (2009)
11. Medelyan, O., Witten, I., Milne, D.: Topic indexing with Wikipedia. In: The AAAI Wikipedia and AI Workshop, Chicago (2008)
12. Milne, D., Witten, I.: An effective, low-cost measure of semantic relatedness obtained from Wikipedia links. In: The Workshop on Wikipedia and Artificial Intelligence at AAAI, Chicago, pp. 25–30 (2008)
13. Nigam, K., Ghani, R.: Analyzing the effectiveness and applicability of co-training. In: The 9th CIKM, New York, pp. 86–93 (2000)
14. Scholkopf, B., Burges, C., Smola, A.: Advances in kernel methods: support vector learning. MIT Press, Cambridge (1999)
15. Shakhnarovich, G., Darrell, T., Indyk, P.: Nearest-Neighbor methods in learning and vision. The MIT Press, Cambridge (2005)
16. Feldman, R., Sanger, J.: The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data. Cambridge University Press, Cambridge (2007)
Frequent Itemset Based Hierarchical Document Clustering Using Wikipedia as External Knowledge Kiran G V R, Ravi Shankar, and Vikram Pudi International Institute of Information Technology, Hyderabad {kiran gvr,krs reddy}@students.iiit.ac.in,
[email protected]
Abstract. High dimensionality is a major challenge in document clustering. Some of the recent algorithms address this problem by using frequent itemsets for clustering, but most of them neglect the semantic relationship between words. On the other hand, there are algorithms that account for the semantic relations between words by making use of external knowledge contained in WordNet, MeSH, Wikipedia, etc., but they do not handle the high dimensionality. In this paper we present an efficient solution that addresses both these problems. We propose a hierarchical clustering algorithm using closed frequent itemsets that uses Wikipedia as external knowledge to enhance the document representation. We evaluate our methods based on F-Score on standard datasets and show our results to be better than existing approaches. Keywords: Frequent itemsets, Document clustering, Wikipedia, Ontology, TF-IDF.
1
Introduction
A recent trend in clustering documents is the use of frequent itemsets. These methods handle the high dimensionality of the data by considering only the words which are frequent for clustering. A frequent itemset is a set of words which occur together frequently and are good candidates for clusters. Many algorithms in this category consider the entire set of frequent itemsets for clustering, which may lead to redundant clusters. Most approaches performing document clustering do not consider the semantic relationship between words. Thus, if two documents talking about the same topic do so using different words (which may be synonyms), these algorithms cannot find the similarity between them and may cluster them into two different clusters. A simple solution to this problem is to use an ontology to enhance the document representation. Topic detection is a problem very closely related to that of document clustering. Intuitively, the goal here is to determine the set of all topics that are contained in a document. A topic is not necessarily the same as a keyword, but can be defined by a set of related keywords. The problem of topic detection therefore reduces to finding sets of related keywords in a document collection.
The contributions of this paper are: (1) First, we show a strong formal connection between the dual problems of document clustering and topic detection when seen in the context of frequent itemset mining. (2) Second, we propose an efficient document clustering algorithm based on frequent itemset mining concepts that has the following attractive features: (a) It handles high dimensional data – up to thousands of dimensions. (b) It allows the use of external knowledge such as that contained in Wikipedia to handle semantic relations between words. (c) It achieves a compact clustering using concepts from generalized frequent itemsets [1]. (d) It provides meaningful labels to the clusters. While there are several document clustering algorithms that also have some of the above features, ours is unique in that it combines all of them. The rest of the paper is organized as follows: Section 2 describes the itemset based document clustering problem. Section 3 talks about the related work. Section 4 briefly describes our entire algorithm. Section 5 details our approaches to reduce document duplication and Section 6 presents the experiments that support our approaches. Finally we conclude our paper in Section 7.
2
Itemset-Based Document Clustering Problem
In this section, we formally describe the problem of itemset-based clustering of documents. Our formulation captures the essence of related methods (described in Section 3) in an elegant manner. In particular, we show that the dual problems of document clustering and topic detection are related very closely when seen in the context of frequent itemset mining. Intuitively, the document clustering problem is to cluster text documents by using the idea that similar documents share many common keywords. Alternatively, the topic detection problem is to group related keywords together into meaningful topics using the idea that similar keywords are present in the same documents. Both of these problems are naturally solved by utilizing frequent itemset mining as follows. For the first problem, keywords in documents are treated as items and the documents (treated as sets of keywords) are analogous to transactions in a market-basket dataset. This forms a transaction space, which we refer to as the doc-space, as illustrated below, where the di's are documents and the wij's are keywords.
d1 – [w11, w21, w31, w41, ...]
d2 – [w12, w22, w32, w42, ...]
d3 – [w13, w23, w33, w43, ...]
...
Then, in this doc-space, frequent combinations of keywords (i.e., frequent itemsets) that are common to a group of documents convey that those documents are similar to each other and thereby help in defining clusters. For example, if (a, b, c) is a frequent itemset of keywords, then (d1, d3, d4), the documents that contain these keywords, form a cluster. For the second problem, documents themselves are treated as items and the keywords are analogous to transactions – the set of documents that contain a
keyword is the transaction for that keyword. This forms a transaction space, which we refer to as the topic space, as illustrated below, where the di's are documents and the wij's are keywords.
w11 – [di1, dj1, dk1, ...]
w12 – [di2, dj2, dk2, ...]
w13 – [di3, dj3, dk3, ...]
...
Then, in this topic space, frequent combinations of documents (i.e., frequent itemsets) that are common to a group of keywords convey that those keywords are similar to each other and thereby help in defining topics. In this context, the following lemma formally captures the relationship between the doc-space and topic-space representations.

Lemma 1. If (w1, w2, w3, ...) is frequent in the doc space, then (di, dj, dk, ...), the documents containing these words, will also be frequent in the topic space, and vice versa.

Proof: Let W = (w1, w2, ..., wn) be frequent in the doc space and D = {d1, d2, ..., dm} be the corresponding set of documents for these frequent words.
=⇒ In the topic space, (d1, d2, d3, ..., dm) occurs in each transaction wi ∈ W.
=⇒ Each di ∈ D occurs in at least n transactions.
=⇒ This means D is frequent in the topic space for all minimum supports ≤ n, where n is the length of W.

The above analogies between document clustering, topic detection and frequent itemset mining are summarized in Table 1.

Table 1. Document Clustering and Topic Detection

Frequent Itemset Mining | Document Clustering | Topic Detection
Item                    | Keyword             | Document
Transaction             | Document            | Keyword
Frequent Itemset        | Document Cluster    | Topic

3
Related Work
Document clustering has been an interesting topic of study for a long time. Unweighted Pair Group Method with Arithmetic Mean (UPGMA) [2] is one of the best algorithms for agglomerative clustering [3]. K-Means and its family of algorithms have also been extensively used in document clustering; Bisecting K-means is the best one in this family for partitional clustering [3]. Clustering using frequent itemsets has been a topic of extensive research in recent times. Frequent Term-based Clustering (HFTC) [4] was the first algorithm in this regard, but HFTC was not scalable, and Fung et al. came up with Hierarchical Document Clustering using Frequent Itemsets (FIHC) [5], which outperforms
HFTC. It provides a hierarchical clustering with labels for the clusters. Some of the drawbacks of FIHC include (i) using all the frequent itemsets to obtain the clustering (the number of frequent itemsets may be very large and redundant), (ii) not being comparable with previous methods like UPGMA and Bisecting K-means in terms of clustering quality [6], and (iii) using hard clustering (each document can belong to at most one cluster). Yu et al. then came up with a much more efficient algorithm using closed frequent itemsets for clustering (TDC) [7]. They also provide a method for estimating the support correctly, but the closed itemsets they use may also be redundant. Recently, Malik et al. proposed Hierarchical Clustering using Closed Interesting Itemsets (which we refer to as HCCI) [6], which is the current state of the art in clustering using frequent itemsets. They further reduce the closed itemsets produced by using only the closed itemsets which are "interesting", with Mutual Information, Added Value, Chi Square, etc. as interestingness measures. They show that this provides significant dimensionality reduction over closed itemsets and, as a result, an increase in cluster quality and performance. The major drawback of the various interestingness measures they provide is that there might be a loss of information when the number of closed itemsets is reduced. We discuss the drawbacks of the score functions used by them in Section 5.1 and propose improvements that overcome these drawbacks. Research is also being done on improving the clustering quality by using an ontology to enhance the document representation. Some of the most commonly available ontologies include WordNet, MeSH, etc. Several works [8,9,10] include these ontologies to enhance document representation by replacing words with their synonyms or the concepts related to them, but all these methods have very limited coverage. Adding new words can also introduce noise into the document, and replacing the original content may cause information loss. Existing knowledge repositories like Wikipedia and ODP (Open Directory Project) can be used as background knowledge. Gabrilovich and Markovitch [11,12] propose a method to improve text classification performance by enriching document representation with Wikipedia concepts. Further, [13,14] present a framework for using Wikipedia concepts and categories in document clustering. Out of the existing knowledge repositories, we chose Wikipedia because it captures a wide range of domains, is frequently updated, is ontologically well structured, and is less noisy [11,12].
4
Our Approach : Overview
In our approach, we first apply a simple frequent itemset mining algorithm like Apriori to all the documents to mine the frequent itemsets. We used the approach given in TDC [7] to calculate the best support. Since words that occur frequently are not necessarily important, we use the concept of generalized closed frequent itemsets [1] to filter out the redundant frequent
itemsets. Even though the number of generalized closed frequent itemsets is much smaller than the number of frequent itemsets, we guarantee that there is very little loss of information in our method. Using the generalized closed frequent itemsets obtained, we construct the initial clusters (as mentioned in Section 2). These initial clusters have a lot of overlap between them, and we use two approaches to reduce the overlapping and obtain the final clusters. The first approach, proposed in TDC, uses tf-idf scores of the words to assign a score to each document in a cluster. This helps us decide which cluster a document best belongs to. The second method uses Wikipedia as an ontology to enrich the document representation. We take each document and enrich its content using the outlinks and categories from Wikipedia. Then we calculate the score of each document in the initial clusters. This score is used to find the best clusters to which a document belongs. We then introduce a new method to generate cluster labels using Wikipedia categories, which we find more informative than existing methods that use frequent itemsets as cluster labels. Our experiments show that the approach using Wikipedia gives better clustering results than the TF-IDF approach because of the use of external knowledge. We also show that our clustering quality is better than most of the existing approaches (Section 6). Typically, frequent itemset mining produces too many frequent itemsets. This is a problem because frequent itemsets correspond to the cluster candidates and the number of clusters needs to be small. To alleviate this problem, we use generalized closed frequent itemsets [1], which are much fewer in number. The approach guarantees that the actual frequent itemsets with their supports can be calculated back within a deterministic factor range.
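The sketch below illustrates the initial-cluster construction in a simplified form: documents are treated as transactions of keywords, frequent itemsets are mined naively with a minimal Apriori-style search, and each itemset defines an initial cluster of the documents that cover it. It does not reproduce the generalized closed frequent itemset algorithm of [1] or the TDC support estimation; those are assumptions of this illustration.

```python
from itertools import combinations
from collections import defaultdict

def frequent_itemsets(docs, min_support, max_len=3):
    """docs: list of keyword sets. Returns {frozenset(itemset): support count}."""
    frequent = {}
    for k in range(1, max_len + 1):
        counts = defaultdict(int)
        for doc in docs:
            for combo in combinations(sorted(doc), k):
                # Apriori property: all (k-1)-subsets must already be frequent.
                if k == 1 or all(frozenset(sub) in frequent
                                 for sub in combinations(combo, k - 1)):
                    counts[frozenset(combo)] += 1
        level = {iset: c for iset, c in counts.items() if c >= min_support}
        if not level:
            break
        frequent.update(level)
    return frequent

def initial_clusters(docs, itemsets):
    """Each itemset -> ids of the documents containing all of its keywords."""
    return {iset: [i for i, doc in enumerate(docs) if iset <= doc]
            for iset in itemsets}

docs = [{"cluster", "wiki", "topic"}, {"cluster", "wiki"}, {"svm", "kernel"}]
fis = frequent_itemsets(docs, min_support=2)
print(initial_clusters(docs, fis))  # e.g. {frozenset({'cluster', 'wiki'}): [0, 1], ...}
```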
5
Reducing Document Duplication
The initial clusters have a lot of overlap between themselves, leading to a single document being present in multiple clusters. In this section, we propose two methods for reducing the document duplication in the initial clusters: Section 5.1 covers the TF-IDF approach and Section 5.2 explains the Wikipedia approach. The basic idea of clustering is to produce disjoint groups of objects/entities, but in the case of document clustering this disjointness does not always hold, because a document may belong to multiple clusters; e.g., a document about Arnold Schwarzenegger may belong to clusters like actors, politics, movies, etc. However, allowing a document to be present in too many clusters might lead to redundant clusters. So, we put a limit on the number of clusters a document can belong to, using the score functions proposed below. 5.1
TF-IDF Approach
We used a document duplication removal and score calculation method similar to TDC [7], which uses the document's TF-IDF vector (using frequent 1-itemsets
only) and adds the term frequencies of individual items that exist in the itemset. We incorporated a few major changes which help in improving the performance. TDC limits document duplication to a maximum of max_dup clusters (the maximum number of clusters in which a document can occur), a user-defined parameter. A major drawback of this approach is that the same max_dup is used for all the documents. But in a general scenario, all documents need not be equally important. So, instead we use a threshold which is the percentage of clusters in which a document can be present. In our experiments, we observed that setting the value of max_dup between 2–4% gives good results. For calculating the score, TDC and HCCI [6] use the following approach:

Score(d, T) = \sum_{t \in T} (d \times t)    (1)

where d × t denotes the tf-idf score of each t in T, the frequent itemset, in document d. For example, if (a, b, c) is a frequent itemset and (d1, d3, d4) is the corresponding document set, the score of d1 in (a, b, c) is the sum of the tf-idf scores of a, b and c in d1. The top max_dup documents having the highest scores are put into their respective clusters by this score. This method has the drawback that longer frequent itemsets get a higher score. We can solve this problem by using Eq. (2) instead:

Score(d, T) = \sum_{t \in T} (d \times t) / length(T)    (2)
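A minimal sketch of the length-normalized scoring of Eq. (2) and the duplication limit follows. The data structures are assumptions (tfidf[d][t] holds the tf-idf weight of term t in document d, and each initial cluster is identified by its frequent itemset T); note that the paper sets max_dup as a percentage of the number of clusters, whereas here it is an absolute count for brevity.

```python
def score(d, T, tfidf):
    """Eq. (2): length-normalized sum of the tf-idf weights of the itemset's terms."""
    return sum(tfidf[d].get(t, 0.0) for t in T) / len(T)

def assign_documents(doc_ids, itemsets, tfidf, max_dup):
    """Keep each document only in its max_dup highest-scoring initial clusters."""
    assignment = {}
    for d in doc_ids:
        ranked = sorted(itemsets, key=lambda T: score(d, T, tfidf), reverse=True)
        assignment[d] = ranked[:max_dup]
    return assignment
```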
5.2
Using Wikipedia
Definition 1 (Wikipedia Categories). Wikipedia contains preclassified topic labels for each document. Each document of Wikipedia belongs to at least one such topic. These can be accessed at the bottom of any Wikipedia page, e.g., (Federer)_category = {World No. 1 tennis players, 21st-century male tennis players, Wimbledon Champions, ...}.

Definition 2 (Wikipedia Outlinks). We define outlinks of Wikipedia to be the words which have an internal hyperlink to a different Wikipedia document, e.g., a page on Federer in Wikipedia has outlinks like (Federer)_outlinks = {tennis, world no. 1, Switzerland, Grand Slam, ...}. Outlinks indicate the major topics which are present in a document.

5.2.1 Mapping Documents to Wikipedia Outlinks and Categories
The mapping process is divided into four steps:
1. Extracting Wikipedia outlinks and categories from each page of Wikipedia.
2. Building an inverted index separately for categories and outlinks, consisting of the categories/outlinks and the list of Wikipedia pages.
3. Constructing a multi-level index on these inverted indexes for fast access.
4. Now, we take a dataset (like Re0, Hitech, etc.) and, for each word in a document of the dataset, we check if a Wikipedia page exists and find the outlinks and categories of that page.

5.2.2 Document Duplication Removal Using Wikipedia
We now have the initial clusters, and we have to assign each document to at most max_dup clusters. To do this, we compute the score of each document d in an initial cluster C as follows:

Score(d, C) = \sum_{d_i \in C} sim(d, d_i) / size(C)    (3)
where d_i is a document that belongs to cluster C, and

sim(d_i, d_j) = csim(d_i, d_j)_{word} + \alpha \cdot csim(d_i, d_j)_{outlinks} + \beta \cdot csim(d_i, d_j)_{category}    (4)

where csim(d_i, d_j)_{word}, csim(d_i, d_j)_{outlinks} and csim(d_i, d_j)_{category} denote the cosine similarity between the words, outlinks and categories of documents d_i and d_j, respectively. A similarity measure similar to Eqn. (4) was proposed in [13]. In Eqn. (3), we divide by size(C) for the same reason as explained for Eqn. (2). An advantage of Eqn. (3) is that we compute the similarity between documents, unlike Eqn. (2), where we calculate the similarity between a frequent itemset and a document. In our experiments we found that setting the values of α and β such that α > 1 and β < 1 yields good clusters because: (i) α weights the similarity between outlinks, which are the words that contain hyperlinks to other pages in that document, so they need a higher weight than ordinary words; in our experiments, we found that setting α between 1.1 and 1.9 gives good results. (ii) β weights the similarity between Wikipedia categories, which are generalizations of the topics in a document; if the categories of two documents match, we cannot be sure whether the two documents talk about the same topic, as opposed to the case where two words are the same. In our experiments, we found that setting β between 0.5 and 1 gives good results.
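The following sketch shows one way Eqs. (3) and (4) could be computed. It assumes each document is represented by three sparse weight dictionaries (word, outlink and category views), which is an assumption of this illustration rather than the authors' data structures; the α and β defaults are taken from the ranges reported above.

```python
import math

def cosine(u, v):
    """Cosine similarity of two sparse vectors given as dicts term -> weight."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def sim(di, dj, alpha=1.5, beta=0.7):
    """Eq. (4): word + alpha*outlink + beta*category cosine similarities."""
    return (cosine(di["word"], dj["word"])
            + alpha * cosine(di["outlinks"], dj["outlinks"])
            + beta * cosine(di["category"], dj["category"]))

def cluster_score(d, cluster_docs, alpha=1.5, beta=0.7):
    """Eq. (3): average similarity of d to the documents of the cluster."""
    return sum(sim(d, di, alpha, beta) for di in cluster_docs) / len(cluster_docs)
```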
5.3 Generation of Hierarchy and Labeling of Clusters
We generate the hierarchy of clusters using a method similar to the one used in HCCI [6], except that we use generalized closed itemsets instead of closed interesting itemsets. Many previous approaches like FIHC [5], TDC [7] and HCCI [6] propose a method for creating cluster labels: the label given to a cluster is the frequent itemset to which the documents in the cluster belong. We propose a new method for labeling clusters using Wikipedia categories as labels, which we find to be more interesting and efficient. Using frequent itemsets as cluster labels is a reasonable approach, but it has the problem that there can be frequent itemsets of length > 30, some words may not be important, etc. In
the approach we propose, we take the Wikipedia categories of all the documents of a cluster and find out the top k frequently occurring categories and assign them as the labels to the cluster. We find this method more interesting because categories in Wikipedia represent a more general sense of the documents in the cluster (like the parent nodes in a hierarchy).
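A small sketch of this labeling step is given below; doc["categories"] is an assumed field holding the Wikipedia categories mapped to the document, and k = 3 is an illustrative choice.

```python
# Label a cluster with the k most frequent Wikipedia categories of its documents.
from collections import Counter

def cluster_labels(cluster_docs, k=3):
    counts = Counter(c for doc in cluster_docs for c in doc["categories"])
    return [category for category, _ in counts.most_common(k)]
```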
6
Experimental Evaluation
We performed extensive experiments with both our approaches on five standard datasets of varying characteristics, such as the number of documents and the number of classes. The datasets we used and their properties are described in Table 2. We also compared our approach to existing state-of-the-art approaches like UPGMA [2], Bisecting K-means [3], and closed itemset clustering methods like TDC [7] and HCCI [6]. We used the F-score measure to compute the quality of clustering. The quality of our clustering solution was determined by analyzing how documents of different classes are distributed in the nodes of the hierarchical tree produced by our algorithm on various datasets. A perfect clustering solution would be one in which every class has a corresponding cluster containing exactly the same documents in the resulting hierarchical tree. The results of our approaches are discussed in Sections 6.2 and 6.3. Our results are better than those of the existing algorithms for the following reasons: (i) we use generalized closed frequent itemsets, which guarantee minimum loss of information and, at the same time, reduce the number of closed itemsets; (ii) we propose improvements to the score functions provided by the existing algorithms and also propose new score functions; (iii) we make use of the knowledge contained in Wikipedia to enhance the clustering.

Table 2. Datasets

Dataset | no. of classes | no. of attributes | no. of docs | Source
Hitech  | 6  | 13170 | 2301  | San Jose Mercury News
Wap     | 20 | 8460  | 1560  | WebACE
Re0     | 13 | 2886  | 1504  | Reuters-21578
Ohscal  | 10 | 11162 | 11465 | Ohsumed-233445
K1a     | 6  | 2340  | 13879 | WebACE

6.1
Datasets
Wikipedia releases periodic dumps of its data1 . We used the latest dump consisting of 2 million documents having a total size of 20GB. The data was present in XML format. We processed the data and extracted the categories and outlinks out of them. Datasets for clustering were obtained from Cluto clustering toolkit [15]. The results of various algorithms like UPGMA, Bisecting K means, 1
http://download.wikipedia.org
FIHC [5], etc. were taken from the results reported in HCCI [6]. HCCI [6] presents several different approaches for clustering; we have taken the best scores among all of them. We used the CHARM toolkit for mining closed frequent itemsets. 6.2
Evaluation of TF-IDF Approach
Our TF-IDF approach stated in Section 5.1 reduces the document duplication and produces the final clusters. In this approach, we propose improvements to the existing methods used in TDC and HCCI. We can see from the results in Table 3 that our approach performs better than TDC for all the datasets. Compared to HCCI, our approach performs better on three datasets (Hitech, Wap, Ohscal) and comparably on the remaining datasets.

Table 3. Comparison of F-Score using our approaches

Dataset | UPGMA | Bisecting k-means | FIHC  | TDC  | HCCI  | Tf-idf | Wikipedia
Hitech  | 0.499 | 0.561             | 0.458 | 0.57 | 0.559 | 0.578  | 0.591
Wap     | 0.640 | 0.638             | 0.391 | 0.47 | 0.663 | 0.665  | 0.681
Re0     | 0.548 | 0.59              | 0.529 | 0.57 | 0.701 | 0.645  | 0.594
Ohscal  | 0.399 | 0.493             | 0.325 | N/A  | 0.547 | 0.583  | 0.626
K1a     | 0.646 | 0.634             | 0.398 | N/A  | 0.654 | 0.626  | 0.672

6.3
Evaluation of Wikipedia Approach
The Wikipedia approach proposed in Section 5.2 uses background knowledge from Wikipedia to enhance the document representation. We expect better results than with the TF-IDF approach because of the use of an ontology. The results in Table 3 illustrate that our approach performs better than all other existing approaches on four datasets (Hitech, Wap, Ohscal, K1a). The drawback of our approach is that it might not be of great use for datasets which do not have sufficient coverage in Wikipedia, i.e., whose documents do not have corresponding pages in Wikipedia (like Re0).
7
Conclusions and Future Work
In this paper, we presented a hierarchical algorithm to cluster documents using frequent itemsets and Wikipedia as background knowledge. We used generalized closed frequent itemsets to construct the initial clusters. Then we proposed two methods, using (1) TF-IDF and (2) Wikipedia as external knowledge, to remove the document duplication and construct the final clusters. In addition, we proposed a method for labeling clusters. We evaluated our approaches on five standard datasets and found that our results are better than the current state-of-the-art methods. In the future, these ideas can be extended to other areas like evolutionary clustering and data streams by using incremental frequent itemsets.
References
1. Pudi, V., Haritsa, J.R.: Generalized Closed Itemsets for Association Rule Mining. In: Proc. of IEEE Conf. on Data Engineering (2003)
2. Kaufman, L., Rousseeuw, P.J.: Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons, Inc., Chichester (1990)
3. Zhao, Y., Karypis, G.: Evaluation of Hierarchical Clustering Algorithms for Document Datasets. In: Proc. of Intl. Conf. on Information and Knowledge Management (2002)
4. Beil, F., Ester, M., Xu, X.: Frequent Term-based Text Clustering. In: Proc. of Intl. Conf. on Knowledge Discovery and Data Mining (2002)
5. Fung, B., Wang, K., Ester, M.: Hierarchical Document Clustering using Frequent Itemsets. In: Proc. of SIAM Intl. Conf. on Data Mining (2003)
6. Malik, H.H., Kender, J.R.: High Quality, Efficient Hierarchical Document Clustering Using Closed Interesting Itemsets. In: Proc. of IEEE Intl. Conf. on Data Mining (2006)
7. Yu, H., Searsmith, D., Li, X., Han, J.: Scalable Construction of Topic Directory with Nonparametric Closed Termset Mining. In: Proc. of Fourth IEEE Intl. Conf. on Data Mining (2004)
8. Hotho, A., Staab, S., et al.: WordNet Improves Text Document Clustering. In: The 26th Annual Intl. ACM SIGIR Conf., Proc. of Semantic Web Workshop (2003)
9. Hotho, A., Maedche, A., Staab, S.: Text Clustering Based on Good Aggregations. In: Proc. of IEEE Intl. Conf. on Data Mining (2001)
10. Zhang, X., Jing, L., Hu, X., et al.: A Comparative Study of Ontology Based Term Similarity Measures on Document Clustering. In: Proc. of 12th Intl. Conf. on Database Systems for Advanced Applications (2007)
11. Gabrilovich, E., Markovitch, S.: Overcoming the Brittleness Bottleneck Using Wikipedia: Enhancing Text Categorization with Encyclopedic Knowledge. In: Proc. of the 21st National Conf. on Artificial Intelligence (2006)
12. Gabrilovich, E., Markovitch, S.: Computing Semantic Relatedness Using Wikipedia-based Explicit Semantic Analysis. In: Proc. of the 20th Intl. Joint Conf. on Artificial Intelligence (2007)
13. Hu, X., Zhang, X., Lu, C., et al.: Exploiting Wikipedia as External Knowledge for Document Clustering. In: Proc. of Knowledge Discovery and Data Mining (2009)
14. Hu, J., Fang, L., Cao, Y., et al.: Enhancing Text Clustering by Leveraging Wikipedia Semantics. In: Proc. of 31st Annual Intl. ACM SIGIR Conf. on Research and Development in Information Retrieval (2008)
15. Cluto: http://glaros.dtc.umn.edu/gkhome/views/cluto
Automatic Authorship Attribution for Texts in Croatian Language Using Combinations of Features Tomislav Reicher, Ivan Krišto, Igor Belša, and Artur Šilić Faculty of Electrical and Computing Engineering University of Zagreb Unska 3, 10000 Zagreb, Croatia {tomislav.reicher,ivan.kristo,igor.belsa,artur.silic}@fer.hr
Abstract. In this work we investigate the use of various character, lexical, and syntactic level features and their combinations in automatic authorship attribution. Since the majority of text representation features are language specific, we examine their application on texts written in Croatian language. Our work differs from the similar work in at least three aspects. Firstly, we use slightly different set of features than previously proposed. Secondly, we use four different data sets and compare the same features across those data sets to draw stronger conclusions. The data sets that we use consist of articles, blogs, books, and forum posts written in Croatian language. Finally, we employ a classification method based on a strong classifier. We use the Support Vector Machines to learn classifiers which achieve excellent results for longer texts: 91% accuracy and F1 measure for blogs, 93% acc. and F1 for articles, and 99% acc. and F1 for books. Experiments conducted on forum posts show that more complex features need to be employed for shorter texts. Keywords: author attribution, function words, POS n-grams, feature combinations, SVM.
1
Introduction
Automatic authorship attribution is a process in the field of text classification dealing with author identification of a given text. It can be interpreted as a problem of text classification based on linguistic features specific to certain authors. The main concern in computer-based authorship attribution is defining the appropriate characterization of the text. Such a characterization should capture the writing style of the authors [4]. Authorship attribution can help in document indexing, document filtering and hierarchical categorization of web pages [12]. These applications are common in the field of information retrieval. It must be noted that authorship attribution differs from plagiarism detection: plagiarism detection attempts to detect similarities between two substantially different pieces of work, but it is unable to determine whether they were produced by the same author or not [5].
The problem of authorship attribution can be divided into three categories [23]: binary, multi-class and single-class (or one-class) classification. Binary classification solves the problem when the data set contains the texts written by one of two authors. Multi-class classification is a generalization of the binary classification when there are more than two authors in the data set. One-class classification is applied when only some of the texts from the data set are written by a particular author while the authorship of all the other texts is unspecified. This classification ascertains whether a given text belongs to a single known author or not. This paper presents a study of multi-class classification for the texts written in the Croatian language. The work is oriented on the combination and evaluation of different text representation features on different data sets. The rest of the paper is organized in the following manner. Section 2 discusses related work in authorship attribution and similar problems. Section 3 introduces different types of text representation features that we have utilized. Section 4 describes the classification method, Section 5 describes the used data sets and Section 6 presents the evaluation methods and experiment results. The conclusion and future work are given in Section 7.
2
Related Work
There are several approaches to author attribution with respect to the text representation features used for classification. Based on those features, the following taxonomy can be made [17]: character, lexical, syntactic, semantic and application-specific features. Feature choice for topic classification of Croatian texts was investigated in [13,21]. These methods use character and lexical features. The following paragraphs describe character, lexical and syntactic features in more depth and relate our work to the existing research. Character features are the simplest text representation features. They consider text as a mere sequence of characters and are thereby usable for any natural language or corpus. Various measures can be defined, such as character frequencies, digit frequencies, uppercase and lowercase character frequencies, punctuation mark frequencies, etc. [5]. Another type of character based features, which has proven quite successful [15,16], considers extracting frequencies of character n-grams. Text representation using lexical features is characterized by dividing the text into a sequence of tokens (words) that group into sentences. Features directly derived from that representation are the length of words, the length of sentences and vocabulary richness. These types of features have been used in [7,14]. The results achieved demonstrate that they are not sufficient for the task, mostly due to their significant dependence on text genre and length. However, taking advantage of features based on frequencies of different words, especially function words, can produce fairly better results [1,10,19,23]. Analogous to character n-grams, word n-gram features can be defined. They are shown to be quite successful as well [4,8]. The use of syntactic features is governed by the idea that authors tend to unconsciously use similar syntactic patterns. Information related to the structure
of the language is obtained by an in-depth syntactic analysis of the text, usually using some sort of NLP tool. A single text is characterized by the presence and frequency of certain syntactic structures. Syntax-based features were introduced in [20], where rewrite rule frequencies were utilized. Stamatatos et al. [18] used noun, verb and prepositional phrase frequencies. Using a Part-of-speech (POS) tagger one can obtain POS tag and POS tag n-gram frequencies. Excellent results can be achieved using such features [6,10,11,12]. Koppel et al. [10] show that the use of grammatical errors and informal styling (e.g., writing sentences in capitals) as text features can be useful in authorship attribution. Our work is based on the composition and evaluation of various aforementioned text representation features. We use different character, lexical and syntactic features and adapt them for use with the Croatian language. We use punctuation mark and vowel frequencies as character features. Word length, sentence length and function word frequencies are used as lexical features. For the syntax-based features we use ones relatively similar to POS tag and POS tag n-gram frequencies.
3
Text Representation
When constructing an authorship attribution system, the central issue is the selection of sufficiently discriminative features. A feature is discriminative if it is common for one author and rare for all the others. Due to the large number of authors, some complex features are very useful if their distribution is specific to each author. Moreover, as the texts in a data set greatly differ in length and topic, it is necessary to use features that are independent of such variations. Otherwise, the generality of the system's application would most certainly be reduced and the accuracy could decrease (e.g., by relating an author to a concrete topic or terms). In the following subsections we describe the different features used. 3.1
Function Words
Function words, such as adverbs, prepositions, conjunctions, or interjections, are words that have little or no semantic content of their own. They usually indicate a grammatical relationship or a generic property [23]. Although one would assume that frequencies of some of the less used function words would be a useful indicator of author’s style, even the frequencies of more common function words can adequately distinguish between the authors. Due to the high frequency of function words and their significant roles in the grammar, an author usually has no conscious control over their usage in a particular text [1]. They are also topic-independent. Therefore, function words are good indicators of the author’s style. It is difficult to predict whether these words will give equally good results for different languages. Moreover, despite the abundance of research in this field, due to various languages, types and sizes of the texts, it is currently impossible to conclude if these features are generally effective [23].
In addition to function words, in this work we also consider frequencies of auxiliary verbs and pronouns, since their frequencies might be representative of the author's style. This results in a total set of 652 function words being used. 3.2
Idf Weighted Function Words
The use of features based on function word frequencies often implies the problem of determining how important, in terms of discrimination, a specific given function word is [6]. To cope with this problem we used a combination of Lp length normalization and a transformation of the function word occurrence frequency, in particular the idf (inverse document frequency) measure. The idf measure is defined as [6]

F_{idf}(t_k) = \log \frac{n_d}{n_d(t_k)}    (1)
where n_d(t_k) is the number of texts in the learning data set that contain the word t_k, and n_d is the total number of texts in that learning data set. This measure gives high values to words that appear in a small number of texts and are thus very discriminative. Since the idf measure only uses information about the presence of a certain function word in the texts, ignoring the frequency of that word, a word that appears many times in one single text and once in all the others gets the same value as one which appears once in all of the texts. Therefore, it is necessary to multiply the obtained idf measure of the given function word by the occurrence frequency of that word in the observed text.
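A minimal sketch of this idf-weighted feature follows. FUNCTION_WORDS stands in for the 652-word Croatian list, which is not reproduced here, and the Lp length normalization mentioned above is omitted; both are assumptions of this illustration.

```python
import math

FUNCTION_WORDS = ["i", "u", "na", "se", "da"]      # illustrative subset only

def idf(train_token_sets, word):
    """Eq. (1): log of (number of training texts) / (texts containing the word)."""
    n_d = len(train_token_sets)
    n_d_t = sum(1 for tokens in train_token_sets if word in tokens)
    return math.log(n_d / n_d_t) if n_d_t else 0.0

def idf_weighted_features(text_tokens, train_token_sets):
    """One feature per function word: idf(word) * frequency of the word in the text."""
    return [idf(train_token_sets, w) * text_tokens.count(w) for w in FUNCTION_WORDS]
```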
3.3 Part-of-Speech
The next three features we use are morphosyntactic features. To obtain them, some sort of NLP tool is required. The Croatian language is morphologically complex, and it is difficult to develop an accurate and robust POS or MSD (Morphosyntactic Description) tagger. We utilized the method given in [22], which uses an inflectional morphological lexicon to obtain the POS and MSD of each word in the text. However, the problem of homography remained. Homographs are equally spelled words with different meanings in their context, consequently having different MSDs. A large percentage of word forms is ambiguous, and using the given method we cannot disambiguate homographs, so all possible POS and MSD tags for such words are considered. The simplest morphosyntactic features that we use are based on POS frequency, similar to those used in [11]. The given text is preprocessed and the POS of each word in the text is determined. The features are obtained by counting the number of occurrences of different POS tags and then normalizing by the total number of words in the text. The used POS tags are adpositions, conjunctions, interjections, nouns, verbs, adjectives, and pronouns. In addition, the category "unknown" is introduced for all the words not found in the lexicon.
3.4
Word Morphological Category
More complex morphosyntactic features that we use take advantage of words' morphological categories. Each word can be described by a set of morphological categories depending on the word's POS: case, gender, and number for nouns; form, gender, number, and person for verbs; and case, degree, gender, and number for adjectives. Moreover, each morphological category can take one of a number of different values; for example, a noun can be in the nominative, genitive, dative, accusative, vocative, locative, or instrumental case. The features that we use are obtained by counting the number of occurrences of the different values for each morphological category and then normalizing them by the total number of words in the text. If, for example, a sentence consists of two nouns and an adjective in the nominative case, then the number of occurrences of the nominative case equals three and the normalized nominative case frequency equals one. 3.5
Part-of-Speech n-Grams
POS n-grams are similar to word n-grams. Instead of using the actual words found in a text, all words are replaced by their POS to make a new text representation that is then used to count the number of occurrences of different POS n-grams. The number of occurrences of every single POS n-gram is normalized by the total number of n-grams to make this feature independent of the text length. The POS tags we use are those given in Subsection 3.3. Since POS n-gram features can cause very large dimensionality, only 3-grams are considered. An example of a POS n-gram for the word 3-gram "Adam i Eva" ("Adam and Eve") is the "noun conjunction noun" trigram. Further, we investigate the use of n-grams on POS and function words in parallel: the nouns, verbs and adjectives in a given text are replaced by their POS, while all the other words, which are considered to be function words, are left as they are. Therefore, the 3-gram "Adam i Eva" transforms into the "noun i noun" 3-gram. Frequencies of such n-grams are then used as features. The use of such features is motivated by the idea of capturing contextual information of function words. Due to the many different pronouns, conjunctions, interjections and prepositions that make up many different n-grams, frequency filtering is applied – only the frequencies of the 500 most frequent 3-grams in the training data set are considered. The used dimensionality reduction method is not optimal, so other methods should be evaluated in future work, such as information gain, the χ2 test, mutual information, maximum relevance, minimum redundancy, etc.
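A small sketch of the mixed POS / function-word 3-grams described above is given below. The (word, POS) pairs are assumed to come from the lexicon-based tagger of [22]; the tag names and the frequency-filtering step are simplified here.

```python
# Content words (nouns, verbs, adjectives) are abstracted to their POS tag,
# all other words are kept as they are, and 3-gram frequencies are counted.
from collections import Counter

CONTENT_POS = {"noun", "verb", "adjective"}

def mixed_tokens(tagged_words):
    """tagged_words: list of (word, pos). Keep function words, abstract the rest."""
    return [pos if pos in CONTENT_POS else word.lower() for word, pos in tagged_words]

def trigram_frequencies(tagged_words):
    tokens = mixed_tokens(tagged_words)
    trigrams = list(zip(tokens, tokens[1:], tokens[2:]))
    total = len(trigrams) or 1
    return {tg: c / total for tg, c in Counter(trigrams).items()}

# trigram_frequencies([("Adam", "noun"), ("i", "conjunction"), ("Eva", "noun")])
# -> {("noun", "i", "noun"): 1.0}
```

3.6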
Other Features
Other features that we use are simple character and lexical features: punctuation marks, vowels, words length and sentence length frequencies. A set of following punctuation marks is used: “.”, “,”, “!”, “?”,“’”, “"”, “-”, “:”, “;”, “+”, and “*”. Their appearance in the text is counted and the result is normalized by the total number of characters in the text.
Features based on the frequency of vowel occurrence (a, e, i, o, u) are obtained in an equal manner. The frequencies of word lengths are obtained by counting occurrences of words for each possible word length. The number is normalized by the total number of words in the given text. To increase robustness, words longer than 10 characters are counted as if they were 10 characters long. The sentence length frequency is obtained in a similar procedure. For the same reason as with the word lengths, sentences longer than 20 words are counted as if they were 20 words long. All features suggested in this subsection have weak discriminatory power on their own. However, they have proved to be very useful in combination with other features as shown in Table 1.
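A small sketch of the word-length and sentence-length frequency features described above (with lengths capped at 10 characters and 20 words, as in the text); the simple whitespace tokenization is an assumption of this illustration.

```python
from collections import Counter

def word_length_freqs(words):
    counts = Counter(min(len(w), 10) for w in words)
    total = len(words) or 1
    return [counts[i] / total for i in range(1, 11)]

def sentence_length_freqs(sentences):
    counts = Counter(min(len(s.split()), 20) for s in sentences)
    total = len(sentences) or 1
    return [counts[i] / total for i in range(1, 21)]
```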
4
Classification
All features mentioned in this work use some sort of frequency information, which makes it possible to represent texts with real-valued feature vectors. Having the texts in the form of vectors, we used an SVM (Support Vector Machine) with a radial basis function kernel for classification. It has been shown that, with the use of parameter selection, a linear SVM is a special case of an SVM with an RBF kernel [9], which removes the need to consider a linear SVM as a potential classifier. The RBF kernel is defined as:

k(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2)    (2)
Before commencing the learning process and classification with the SVM, we scale the data to ensure an equal contribution of every attribute to the classification: the components of every feature vector are scaled to the interval [0, 1]. For the application of an SVM classifier to practical problems it is reasonable to consider an SVM with soft margins defined by the parameter C, as in [3]. The parameter C together with the γ used in the RBF kernel completely define an SVM model. The search for the appropriate parameters (C, γ), i.e., model selection, is done by means of cross-validation: using 5-fold cross-validation on the learning set, the parameters (C, γ) yielding the highest accuracy are selected and the SVM classifier is learned using them. The accuracy of classification is measured by the expression

acc = n_c / N    (3)

where n_c is the number of correctly classified texts and N is the total number of texts. The parameters (C, γ) that were considered are: C ∈ {2^{-5}, 2^{-4}, ..., 2^{15}}, γ ∈ {2^{-15}, 2^{-14}, ..., 2^{3}} [2].
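The sketch below illustrates this model selection procedure. The paper uses LIBSVM [2]; here scikit-learn (which wraps LIBSVM) is used purely for illustration, and the toy data at the end is a placeholder.

```python
# [0, 1] scaling followed by a 5-fold cross-validated grid search over (C, gamma).
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import MinMaxScaler
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV

param_grid = {
    "svm__C": [2.0 ** e for e in range(-5, 16)],       # 2^-5 .. 2^15
    "svm__gamma": [2.0 ** e for e in range(-15, 4)],   # 2^-15 .. 2^3
}

pipeline = Pipeline([
    ("scale", MinMaxScaler()),          # scale every attribute to [0, 1]
    ("svm", SVC(kernel="rbf")),
])

X, y = np.random.rand(40, 10), np.repeat([0, 1], 20)   # placeholder data
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)
print(search.best_params_, search.best_score_)
```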
5
Data Set
We have used four different data sets in our experimental evaluation. First we experimented using an on-line archive of proofread articles (journals) from a
daily Croatian newspaper “Jutarnji list,” available at http://www.jutarnji.hr/ komentari/. The data set consists of 4571 texts written by 25 different authors. The texts are not evenly distributed among authors – numbers vary from 14 to 1285 texts per author with an average of 182 texts per author. The articles also differ in size – the lowest average number of words in the text per author is 315 words, and the highest average is 1347 words. An average number of words in a text per author is 717 words. Considering this analysis, we can conclude that the used data set is very heterogeneous. Since the topics in these articles tend to be time-specific, to avoid overfitting we split the set by dates – 20% of the newest articles of each author are taken for testing (hold-out method). The second data set that we have used is a collection of on-line blogs. Blog authors were carefully selected: only those authors with more than 40 posts and posts longer than 500 words were considered. In addition, special attention was paid to select only posts that contain original information as authors often tend to quote other sources like news, books, other authors, etc. As a result, we obtained a dataset consisting of 22 authors with a total of 3662 posts. An average number of words in a post is 846 words. For the testing purposes we split the data set by dates using the same technique that was applied to the articles. The third data set is comprised of Croatian literature classics. We have selected 52 novels and novellas from 20 different authors and divided them by chapters, treating each chapter as a separate text (the same approach was used by Argamon et al. [1]). All chapters longer than 4000 words were further divided. That resulted with a total of 1149 texts, with an average of 2236 words in a text. The lowest average number of words in a text per author is 1407 words and the highest average is 3652 words. We have randomly selected 20% of the texts from each author for the test set, leaving the rest as a training set. The last data set used is a set of Internet forum (message board) posts from http://www.forum.hr/. We have selected all the posts from the threads in Politics category. Initially there were 2269 different authors with a total of 64765 posts. Subsequently we have chosen only those authors that on average have more than 100 words per post and more than 64 posts in total. As a result, the data set consisting of 19 authors and 4026 posts was obtained. Twenty percent of the newest posts of each author were taken as the test set.
6
Evaluation
Classifier quality is measured as the ratio of correctly classified texts to the total number of texts in the test set, i.e., micro-averaged accuracy. First we tested all features separately and then we tested their various combinations. Results of the evaluation on articles, blogs, and books are shown in Table 1. Total accuracy does not explain the behavior of a classifier for every class by itself, so precision and recall were calculated for each class and then used to calculate the total macro F1 measure. The SVM parameter selection is very time consuming; thus, not all of the feature combinations were tested. We focused on the evaluation of feature
Table 1. Evaluation of different features using accuracy and F1 measures [%]

Features | Number of Features | Newspapers Acc | Newspapers F1 | Blogs Acc | Blogs F1 | Books Acc | Books F1
Function Words (F) | 652 | 88.39 | 87.38 | 87.21 | 87.08 | 97.24 | 97.12
Idf Weighted Func. Words (I) | 652 | 87.96 | 86.84 | 87.10 | 86.94 | 97.24 | 97.11
Word POS (C) | 8 | 44.50 | 38.18 | 34.95 | 29.85 | 38.97 | 33.63
Punctuation Marks (P) | 11 | 57.50 | 52.93 | 60.64 | 59.35 | 76.55 | 76.31
Vowels (V) | 5 | 30.54 | 16.24 | 25.58 | 21.84 | 42.41 | 39.72
Words Length (L) | 11 | 43.19 | 33.22 | 38.59 | 35.11 | 47.24 | 44.15
Sentence Length (S) | 20 | 40.49 | 33.70 | 28.00 | 21.79 | 25.86 | 20.12
POS n-grams, 1st meth. (N1) | 512 | 71.29 | 68.68 | 56.67 | 54.05 | 80.34 | 79.83
POS n-grams, 2nd meth. (N2) | 500 | 76.09 | 72.52 | 67.03 | 66.16 | 91.72 | 91.45
Word Morph. Category (M) | 22 | 61.17 | 58.91 | 58.32 | 57.04 | 67.24 | 66.13
C, M | 30 | 63.17 | 62.06 | 59.65 | 57.98 | 71.72 | 71.84
P, F | 663 | 92.41 | 91.93 | 88.75 | 88.69 | 98.62 | 98.57
F, M | 674 | 91.18 | 90.68 | 88.53 | 88.41 | 97.93 | 97.89
F, N1 | 1164 | 89.44 | 88.52 | 84.45 | 84.04 | 97.59 | 97.54
F, N2 | 1152 | 90.92 | 90.43 | 86.55 | 86.25 | 97.93 | 97.80
F, C | 660 | 90.05 | 89.52 | 88.20 | 87.99 | 97.59 | 97.49
I, M | 674 | 90.84 | 90.35 | 88.53 | 88.30 | 97.93 | 97.88
N1, M | 534 | 71.38 | 70.15 | 62.29 | 60.20 | 84.48 | 83.80
I, M, C | 682 | 91.36 | 90.89 | 88.31 | 88.16 | 97.93 | 97.81
P, F, L | 674 | 93.37 | 92.96 | 90.30 | 90.22 | 98.62 | 98.57
P, F, L, M | 696 | 93.46 | 93.09 | 90.85 | 90.65 | 98.28 | 98.23
P, F, L, C | 682 | 93.28 | 92.97 | 89.97 | 89.87 | 98.62 | 98.57
P, F, L, N2 | 1174 | 90.31 | 89.24 | 87.32 | 87.21 | 98.28 | 98.14
P, F, L, M, C | 704 | 93.46 | 93.15 | 90.85 | 90.73 | 98.62 | 98.58
P, F, L, M, N2 | 1196 | 91.54 | 90.52 | 88.09 | 87.95 | 98.28 | 98.14
P, F, L, M, C, N2 | 1204 | 91.80 | 90.88 | 88.64 | 88.49 | 97.93 | 97.81
P, F, V, L | 679 | 93.37 | 93.04 | 90.19 | 90.11 | 98.62 | 98.57
P, C, N2, M | 541 | 85.42 | 84.11 | 73.87 | 73.06 | 94.14 | 94.02
F, M, C, N1 | 1194 | 89.62 | 88.70 | 86.11 | 85.69 | 98.62 | 98.57
S, P, F, L | 694 | 92.67 | 92.18 | 91.29 | 91.23 | 99.66 | 99.65
S, P, N2, L | 542 | 83.33 | 81.68 | 73.98 | 73.31 | 95.17 | 95.06
S, P, F, V, L | 699 | 93.19 | 92.85 | 91.18 | 91.10 | 98.62 | 98.58
S, P, F, V, L, M | 721 | 93.19 | 92.88 | 91.51 | 91.38 | 98.62 | 98.57
S, P, F, V, L, M, N1 | 1233 | 92.41 | 91.96 | 89.08 | 88.71 | 98.28 | 98.23
S, P, F, V, L, M, N1, C | 1241 | 92.41 | 91.96 | 89.75 | 89.44 | 98.28 | 98.23
To further evaluate the impact of text length on the performance of the selected features, we performed experiments on much shorter texts, using a data set of forum posts; the results were not satisfactory. The highest accuracy, 27%, was achieved using punctuation mark frequencies.
7 Conclusion
We have shown that the authorship attribution problem, when applied to morphologically complex languages such as Croatian, can be successfully solved using a combination of relatively simple features. Our results of 91%, 93%, and 99% accuracy on texts with an average length of more than 700 words are notable considering that different heterogeneous data sets were used. Evaluation on the different data sets shows that the same feature combination, based on function word, punctuation mark, word length, and sentence length frequencies, achieves the highest results. We therefore conclude that these features are the most suitable for the task of authorship attribution. Experimental results for short texts indicate that more complex features should be utilized. The similar work of Uzuner et al. [19], on a smaller set of authors, considerably longer texts, and different classifiers, obtained nearly the same results and conclusions. A direct comparison of results is difficult due to the different types of data sets used and the different numbers of authors considered; however, our results fall within the range of previously reported results, 70% to 99% [1,4,8,12,18,19]. In addition, there are no other reported methods or results for the Croatian language, or for any of the related South Slavic languages, so our work presents a basis for further research. In future work, homograph ambiguity should be resolved in order to obtain more accurate syntax-based features. As suggested in [4,8,15], features based on word and character n-grams should be compared to the features used in this work, and a comparison to semantics-based features should be conducted as well. It would also be interesting to investigate features that can cope with shorter texts such as Internet forum posts, e-mails, on-line conversations, or text messages.
Acknowledgement This research has been supported by the Croatian Ministry of Science, Education and Sports under the grant No. 036-1300646-1986. The authors would like to thank Bojana Dalbelo Bašić, Jan Šnajder, and the reviewers for their helpful suggestions.
References 1. Argamon, S., Levitan, S.: Measuring the usefulness of function words for authorship attribution. In: Proceedings of ACH/ALLC (2005) 2. Chang, C.C., Lin, C.J.: LIBSVM: a library for support vector machines (2001), http://www.csie.ntu.edu.tw/~cjlin/libsvm
3. Cortes, C., Vapnik, V.: Support-vector networks. Machine L. 20(3), 273–297 (1995) 4. Coyotl-Morales, R., Villaseñor-Pineda, L., Montes-y Gómez, M., Rosso, P., Lenguaje, L.: Authorship attribution using word sequences. In: Martínez-Trinidad, J.F., Carrasco Ochoa, J.A., Kittler, J. (eds.) CIARP 2006. LNCS, vol. 4225, pp. 844–853. Springer, Heidelberg (2006) 5. De Vel, O., Anderson, A., Corney, M., Mohay, G.: Mining e-mail content for author identification forensics. ACM Sigmod Record 30(4), 55–64 (2001) 6. Diederich, J., Kindermann, J., Leopold, E., Paass, G.: Authorship attribution with support vector machines. Applied Intelligence 19(1), 109–123 (2003) 7. Holmes, D.: Authorship attribution. Comp. and Humanities 28(2), 87–106 (1994) 8. Kešelj, V., Peng, F., Cercone, N., Thomas, C.: N-gram-based author profiles for authorship attribution. In: Proceedings of the Conference Pacific Association for Computational Linguistics, PACLING, vol. 3, pp. 255–264 (2003) 9. Keerthi, S., Lin, C.: Asymptotic behaviors of support vector machines with Gaussian kernel. Neural Computation 15(7), 1667–1689 (2003) 10. Koppel, M., Schler, J.: Exploiting stylistic idiosyncrasies for authorship attribution. Proceedings of IJCAI 3, 69–72 (2003) 11. Kukushkina, O., Polikarpov, A., Khmelev, D.: Using literal and grammatical statistics for authorship attribution. Probl. of Info. Trans. 37(2), 172–184 (2001) 12. Luyckx, K., Daelemans, W.: Shallow text analysis and machine learning for authorship attribution. In: Proceedings of the fifteenth meeting of Computational Linguistics in the Netherlands (CLIN 2004), pp. 149–160 (2005) 13. Malenica, M., Smuc, T., Šnajder, J., Dalbelo Bašić, B.: Language morphology offset: Text classification on a croatian-english parallel corpus. Information Processing and Management 44(1), 325–339 (2008) 14. Mendenhall, T.C.: The characteristic curves of composition. Science (214S), 237 (1887) 15. Peng, F., Schuurmans, D., Wang, S., Keselj, V.: Language independent authorship attribution using character level language models. In: Proc. 10th Conf. on European Chapter of the Assoc. Comp. Ling., vol. 1, pp. 267–274. ACL (2003) 16. Stamatatos, E.: Ensemble-based author identification using character n-grams. In: Proc. 3rd Int. Workshop on Text-based Information Retrieval, pp. 41–46 (2006) 17. Stamatatos, E.: A survey of modern authorship attribution methods. Journal of the American Society for Information Science and Technology 60(3), 538–556 (2009) 18. Stamatatos, E., Fakotakis, N., Kokkinakis, G.: Computer-based authorship attribution without lexical measures. Comp. and Hum. 35(2), 193–214 (2001) 19. Uzuner, O., Katz, B.: A comparative study of language models for book and author recognition. In: Dale, R., Wong, K.-F., Su, J., Kwong, O.Y. (eds.) IJCNLP 2005. LNCS (LNAI), vol. 3651, pp. 969–980. Springer, Heidelberg (2005) 20. van Halteren, H., Tweedie, F., Baayen, H.: Outside the cave of shadows: Using syntactic annotation to enhance authorship attribution. Computers and the Humanities 28(2), 87–106 (1996) 21. Šilić, A., Chauchat, J.H., Dalbelo Bašić, B., Morin, A.: N-grams and morphological normalization in text classification: A comparison on a croatian-english parallel corpus. In: Neves, J., Santos, M.F., Machado, J.M. (eds.) EPIA 2007. LNCS (LNAI), vol. 4874, pp. 671–682. Springer, Heidelberg (2007) 22. Šnajder, J., Dalbelo Bašić, B., Tadić, M.: Automatic acquisition of inflectional lexica for morphological normalisation. Information Processing and Management 44(5), 1720–1731 (2008) 23. 
Zhao, Y., Zobel, J.: Effective and scalable authorship attribution using function words. In: Lee, G.G., Yamada, A., Meng, H., Myaeng, S.-H. (eds.) AIRS 2005. LNCS, vol. 3689, pp. 174–189. Springer, Heidelberg (2005)
Visualization of Text Streams: A Survey

Artur Šilić and Bojana Dalbelo Bašić

Faculty of Electrical and Computing Engineering, University of Zagreb, Unska 3, 10000 Zagreb, Croatia
{artur.silic,bojana.dalbelo}@fer.hr
Abstract. This work presents a survey of methods that visualize text streams. Existing methods are classified and compared from the aspect of the visualization process. We introduce new aspects of method comparison: data type, text representation, and the temporal drawing approach. The subjectivity of visualization is described, and evaluation methodologies are explained. Related research areas are discussed and some future trends in the field are anticipated.

Keywords: Information Visualization, Visual Analytics, Topic Detection and Tracking, Text Mining, Trend Discovery, Visualization Evaluation, Visualization Subjectivity, Dimensionality Reduction, Text Representation, Information Extraction, User Interaction.
1 Introduction
From the earliest days of our civilization, visualizations have been an important way to record, communicate, or analyze real and abstract entities. Over the past two decades, digitalization of textual data has greatly increased information accessibility in all areas of human society. This has introduced a strong need for efficient storage, processing, retrieval, and analysis of texts, since the number of available documents has become quite large. Visual exploration is an efficient and natural way to analyze large text collections. This work presents a survey of methods that visualize text streams. The article is structured as follows. Research areas of wider scope and research areas that interact with text visualization are discussed in Section 2. Data collection types are structured in Section 3 and the visualization process is explained in Section 4. Insight, subjectivity and evaluation of visualizations are presented in Section 5 and successful applications are noted in Section 6. Section 7 concludes the article.
2 Related Research Areas
S. Owen defines visualization as "a tool or method for interpreting image data fed into a computer and for generating images from complex multi-dimensional data sets" [1]. Information visualization is a subfield that studies how to visually
represent large collections of non-numeric information such as text, software source code, various physical or social networks, and so forth [2]. Texts carry information encoded in natural language, so text visualization is a subfield of information visualization. Visual analytics is a closely related area of research which has been described as "the science of analytical reasoning facilitated by interactive visual interfaces" [3]. Information visualization enables visual analysis, which is a subsequent step in knowledge discovery [4]. The importance of moving from confirmatory to explorative data analysis is emphasized in [5,6]. Integrating automatic analysis and exploratory analysis with the aid of visualization techniques will enable more efficient and effective knowledge discovery [5]. Another field of study that relates to text visualization is Topic Detection and Tracking (TDT) [7], a DARPA-sponsored initiative to investigate state-of-the-art methods for finding and tracking new events in a stream of broadcast news stories. The pilot phase started in 1998 and the final phase of the project ended in 2004. The developed methods for detection and tracking of stories are elaborate and data-specific. They can be used as a background tool for text stream visualization, in which case the foreground drawing methods need not be complex while the visualizations can still be highly informative. An example of such a method is EventRiver [8]. Analogously to text and data mining methods, text visualization depends heavily on algebraic algorithms and statistical methods. Dimensionality reduction and feature selection are employed on matrices that represent text collections. When representing texts, all sorts of information extraction and processing methods can be employed. These are mostly developed within the computational linguistics community. In practice, designers of visualization systems have to be aware of how their visualizations are used and examine whether user needs are satisfied while performing exploratory or confirmatory analysis. Due to the human factor involved in the evaluation of text visualization methods, methodologies from the social sciences are used. Information visualization has common tasks of interest with empirical research in human-computer interaction, perceptual psychology, and cognitive reasoning. Although some mutual research has been done, more integration of these communities is encouraged [9]. Finally, visualization researchers and developers can refer to design principles to improve readability, interaction, and general usability. For such advice, see E. Tufte's book [10].
3 Data Types
We specify three text data types upon which the visualization methods operate: collection of texts (C), single text (S), and short interval of a text stream (SI). The last is used to visualize trends in texts arriving in real time [11,12,13]. Such visualizations can be dynamic, viz. they include animations [11,13]. Essentially, text streams are characterized as collections of texts that arrive through time. The following is an explanation: A text collection is composed
of documents which are composed of paragraphs. These paragraphs are broken down into sentences which are the basic carriers of messages in a language. If we observe the paragraph sequence of a single text as a sequence of independent texts, we will come to the conclusion that the formerly established partition of methods is superficial and that any text stream method can be used to visualize a single text (and vice versa). However, the partition based on collection type is useful due to the following reasons: First, the statistical properties of text parts at a single text level and at a collection level differ significantly. For example, probabilities of word occurrence in a single text are noisier, so smoothing has to be used; see [14]. Second, computational demands differ since a single text can be represented within a matter of kilobytes in comparison to large corpora like the New York Times [15] corpus that contains more than 1.8 million news articles from a period of over 20 years and whose size is measured in gigabytes. Even larger corpora are processed: in [16], a collection of over 90 million texts is reported. Finally, it is useful to present methods at all levels of text streams since the basic concepts are common to all methods – to discover and analyze trends in texts. The relativity in comprehension of what is a text collection has already been noted in [17,18].
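As an aside, the smoothing step mentioned above for single-text statistics can be as simple as add-one (Laplace) smoothing of word occurrence probabilities. The following is a hypothetical sketch; [14] may use a different smoother.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Add-one (Laplace) smoothed unigram probabilities for a short, noisy text.
public final class SmoothedUnigrams {

    public static Map<String, Double> probabilities(List<String> tokens,
                                                    int vocabularySize) {
        Map<String, Integer> counts = new HashMap<>();
        for (String t : tokens) counts.merge(t, 1, Integer::sum);
        double denom = tokens.size() + vocabularySize;   // add-one smoothing
        Map<String, Double> p = new HashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            p.put(e.getKey(), (e.getValue() + 1.0) / denom);
        }
        return p;   // unseen vocabulary words implicitly receive 1 / denom
    }
}
```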
4 Visualization Process
In general, a visualization is generated in three steps. First, the texts are transformed into a representation more suitable for subsequent operations. Second, in order to draw a view, a mapping onto a 2D or 3D space is performed; documents, groups of documents, terms, events, relationships, summaries, or other entities are drawn and labelled on the view. Third, user interaction is enabled. The visualization process is refined in more detail in [4]. This section discusses a number of existing methods from the aspect of the visualization process. A brief survey with an emphasis on historical aspects is available in [4]. A brief but well-written review of methods is given as related work in [13]. Text visualization methods in relation to TDT and temporal text mining are structured in [19]. An extensive survey of trend detection which includes visualization methods and commercial products is described in [20]. Also, a comprehensive survey oriented towards commercial patent analysis tools is available in [21].
4.1 Text Representation
Unstructured text is not suitable for visualization, so a text is usually represented in the vector space model (VSM) [22]. This concept is very popular in information retrieval. Bag-of-words, an instance of the VSM, is the most widely used model for text representation. The method consists of counting the word occurrences in the text. This produces a vector in the space of features, which correspond to the words found in the text; in doing so, word order is discarded. Often, the vectors are weighted using the TFIDF method [23]. Alternatives to the VSM exist: a text can be represented using a language model where the text is treated as a set of probabilities; for example, refer to [24].
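A minimal sketch of the bag-of-words representation with TFIDF weighting described above follows. It is a hypothetical helper; real systems add stop-word filtering, morphological normalization, and sparse data structures.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Bag-of-words vectors with TFIDF weighting over pre-tokenized documents.
public final class TfIdf {

    /** Returns one weighted vector per document, keyed by term. */
    public static List<Map<String, Double>> weight(List<List<String>> docs) {
        Map<String, Integer> df = new HashMap<>();          // document frequency
        for (List<String> doc : docs) {
            for (String term : new HashSet<>(doc)) {
                df.merge(term, 1, Integer::sum);
            }
        }
        int n = docs.size();
        List<Map<String, Double>> vectors = new ArrayList<>();
        for (List<String> doc : docs) {
            Map<String, Integer> tf = new HashMap<>();      // term frequency
            for (String term : doc) tf.merge(term, 1, Integer::sum);
            Map<String, Double> v = new HashMap<>();
            for (Map.Entry<String, Integer> e : tf.entrySet()) {
                double idf = Math.log((double) n / df.get(e.getKey()));
                v.put(e.getKey(), e.getValue() * idf);      // tf * idf weight
            }
            vectors.add(v);
        }
        return vectors;
    }
}
```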
Another example of a novel approach to text representation is employed within the MemeTracker framework [16] in which a collection of texts is represented as a phrase graph. For this method VSM is not suitable since information on word order is needed to extract phrases from the text. Other information, apart from bare words, can be extracted from text. Some relations can be trivial to extract from texts (e.g. co-occurrence of names in texts), while others are not expected to be feasible for extraction in the foreseeable future (e.g. author's stance on some entities not directly written in the text, obtainable only by precise logic reasoning and background knowledge). However, as research in the computational linguistics field progresses, information extraction techniques will advance and information visualization will have more data to feed on. We introduce the following structuring of feature extraction methods that are currently in use or are foreseen to be used in the near future:

1. Bag-of-words: This is a baseline used by many methods. It can be improved with stop-word filtering, morphological normalization and word sense disambiguation. Similarly, significant word n-grams can be extracted.
2. Entity recognition: These techniques automatically identify names of persons, organizations, places, or countries [25]. Moreover, relationships among entities found in text can be introduced. Also, events can be visualized [8].
3. Summarization: These techniques include keyword extraction, keyword assignment, thematic categorization, and fact extraction. The aim is to shorten texts and present only the most relevant information, hence the visualization is able to filter the clutter even more effectively. An example of extracted facts visualization can be found in a commercial product ClearForest [20].
4. Document structure parsing: These relatively simple techniques are used to automatically identify elements like title, author, and publication date. Such structural information can be used in visualization as well.
5. Sentiment and affect analysis: These techniques are used to emotionally characterize the content of texts. An example of affect visualization for single documents can be found in [26].

By using these feature extraction methods, the relative importance of each feature present in a text is obtained by general or specific methods. For example, simple occurrence frequency is a common numeric value of a feature in the bag-of-words model. In contrast, relations among entities can be the result of a very complex model that involves syntactic parsing of sentences.
4.2 Drawing a View
Table 1 summarizes this subsection by listing existing visualization methods and their underlying methods. The table also notes the publication year and if the method has an inherent temporal drawing capability apart from time slicing. For visualization examples, see appendix. It is easily seen that temporally oriented methods have been researched only in recent years. In this paper, we classify text visualization methods into two categories: term trend and semantic space approach.
Table 1. List of visualization methods

| Method name | Basic underlying methods | Data type | Temporal | Year | Ref. |
|---|---|---|---|---|---|
| Sammon | Sammon's mapping | C |  | 1969 | [27] |
| Lin et al. | SOM | C |  | 1991 | [28] |
| BEAD | FDP | C |  | 1992 | [29] |
| Galaxy of News | ARN | C |  | 1994 | [30] |
| SPIRE / IN-SPIRE | MDS, ALS, PCA, Clustering | C/S |  | 1995 | [17] |
| TOPIC ISLANDS | MDS, Wavelets | S |  | 1998 | [18] |
| VxInsight | FDP, Laplacian eigenvectors | C |  | 1998 | [31] |
| WEBSOM | SOM, Random Projections | C |  | 1998 | [32] |
| Starlight | TRUST | C |  | 1999 | [33] |
| ThemeRiver | FP | C |  | 2000 | [34] |
| Kaban and Girolami | HMM | C |  | 2002 | [35] |
| InfoSky | FDP, Voronoi Tessellations | C |  | 2002 | [36] |
| Wong et al. | MDS, Wavelets | C |  | 2003 | [37] |
| NewsMap | Treemapping | SI |  | 2004 | [12] |
| TextPool | FDP | SI |  | 2004 | [13] |
| Document Atlas | LSI, MDS | C |  | 2005 | [38] |
| Text Map Explorer | PROJCLUS | C |  | 2006 | [39] |
| FeatureLens | FP | C |  | 2007 | [40] |
| NewsRiver, LensRiver | FP | C |  | 2007 | [41] |
| Projection Explorer (PEx) | PROJCLUS, IDMAP, LSP, PCA | C |  | 2007 | [42] |
| SDV | PCA | S |  | 2007 | [14] |
| Temporal-PEx | IDMAP, LSP, DTW, CDM | C |  | 2007 | [43] |
| T-Scroll | GD, Special clustering | C |  | 2007 | [44] |
| Benson et al. | Agent-based clustering | SI |  | 2008 | [11] |
| FACT-Graph | GD | C |  | 2008 | [45] |
| Petrović et al. | CA | C |  | 2009 | [46] |
| Document Cards | Rectangle packing | S |  | 2009 | [47] |
| EventRiver | Clustering, 1D MDS | C |  | 2009 | [8] |
| MemeTracker | FP, Phrase clustering | C |  | 2009 | [16] |
| STORIES | GD, Term co-occurrence statistics | C |  | 2009 | [19] |
Term Trend Approach. The most straightforward way to visualize trends in text streams is to plot the frequencies (FP, see Table 1) of important terms found in the texts within a given window of time (a minimal counting sketch is given below). Feature selection is employed to reduce the number of dimensions and prevent visual occlusion. Feature selection methods range from simple frequency criteria, as in [34], to more complex statistical measures of feature importance such as information gain or χ2 [23], used in [48]. The term "trend approach" is used in [16,34,40,41]. MemeTracker [16] plots trends by efficiently detecting memetic phrases, which can be regarded as an advanced form of phrase selection. EventRiver [8] plots frequencies of documents relating to certain events; event labelling can be regarded as a sophisticated form of term selection.

Semantic Space Approach. In the semantic space approach, each part of the view corresponds to some semantic category found in the text collection. This approach is used by [11,13,14,17,18,27,28,29,30,31,32,33,35,36,37,38,39,42,43]. The vectors representing texts are often of high dimension because textual features are numerous, so dimensionality reduction techniques are employed in order to map these vectors to 2D or 3D space. Dimensionality reduction techniques used include Latent Semantic Indexing (LSI) [49], Principal Component Analysis (PCA) [50], and Correspondence Analysis (CA) [51], which are all methods based on Singular Value Decomposition (SVD). Before performing these linear algebra operations, a distance function between the vectors has to be defined – usually the Euclidean distance is used.
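The counting step behind the frequency plotting of the term trend approach can be sketched as follows. The Document record, window size, and term list are assumptions made for illustration only; feature selection and the actual drawing are omitted.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Per-term frequencies over fixed time windows, ready to be plotted as trends.
public final class TermTrends {

    public record Document(long timestamp, List<String> tokens) {}

    /** term -> (window index -> frequency), for a fixed window length. */
    public static Map<String, TreeMap<Long, Integer>> counts(
            List<Document> stream, List<String> selectedTerms, long windowMillis) {
        Map<String, TreeMap<Long, Integer>> trends = new HashMap<>();
        for (String term : selectedTerms) trends.put(term, new TreeMap<>());
        for (Document d : stream) {
            long window = d.timestamp() / windowMillis;       // which time slice
            for (String token : d.tokens()) {
                TreeMap<Long, Integer> series = trends.get(token);
                if (series != null) series.merge(window, 1, Integer::sum);
            }
        }
        return trends;
    }
}
```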
Multidimensional Scaling (MDS) [52] is another popular method for reducing dimensions. MDS attempts to embed a set of objects into Rn such that the defined distances among the objects are preserved as much as possible. Sammon’s mapping [27] is a version of MDS. Since MDS-based and SVD-based methods used on large text collections can be too computationally demanding, some tricks are employed. Usually, transformation to the low dimensional space is calculated using a smaller set of vectors which can be a subset of the original set or centroids produced by clustering the original set. Examples of such methods are Anchored Least Stress (ALS) [53], Boeing Text Representation Using Subspace Transformation (TRUST) [33], Least Square Projection (LSP) [54], and Projection by Clustering (PROJCLUS) [39]. Kohonen’s Self Organizing Maps (SOM) [55] is a neurocomputational algorithm used to map high-dimensional data to a lower dimensional space through an unsupervised and competitive learning process. Force-Directed Placement (FDP) [56] is an algorithm that has an analogy in a physical system of masses and springs. By simulation, the algorithm finds a stable point. IDMAP [57] is an adapted FDP-based method used to improve computational complexity. Other general or custom graph-drawing (GD) methods are also used; for examples see [19,44,45]. The TreeMap [58] visualization technique used in [12] makes use of all available display space, mapping the full hierarchy onto a rectangular region in a space-filling manner. The hierarchy among texts can be defined in various ways starting from simple hierarchical clustering. A similar placement method called Rectangle Packing is used in [47]. Other specific methods that are used to draw visualizations include: Wavelet filters used in [18], Hidden Markov Models (HMM) used in [35], Associative Relation Network (ARN) used in [30], Voronoi Tessellations used in [36], and Random Projections used in [32]. Different clustering methods are used in many of the described methods as an intermediate step, with the aim of converting a large set of texts into a smaller set of elements, to find groups of texts with specific properties, or to summarize a number of texts. For now, most methods that use the semantic space approach enable trend discovery in text streams by time slicing. The time slicing method constrains a series of views to a series of time intervals. By analyzing the differences in the views, the user gains insight into changes in the text stream. Time slicing has explicitly been noted for [12,13,17,31,33], although it can be performed with any method since it consists only of choosing a time interval to be visualized. Time slicing has been criticized for being limited in discovery since human memory is limited and change blindness is present [8]. An interesting approach to time series visualization is presented in [43]. Entities are visualized using points that have an associated time signal. The similarity between time signals is introduced by the following measures: Euclidean, Dynamic Time Warping (DTW) [59], and Compression-based Dissimilarity Measure (CDM) [60]. The DTW distance is used for time series with different sizes or distortions in the time axis. The CDM distance is used to detect structural differences and only works in long time
series. Having defined a distance between time series, LSP or IDMAP can be used to project the points onto a low-dimensional space.
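As an illustration of the distance-preserving idea behind MDS discussed above, the following is a tiny sketch that embeds objects in 2D by gradient descent on the raw stress, i.e. the squared difference between given dissimilarities and embedded Euclidean distances. It is illustrative only; production methods such as classical MDS, SMACOF, LSP or PROJCLUS are far more refined and scalable.

```java
import java.util.Random;

// Minimal MDS-style 2D embedding by gradient descent on raw stress.
public final class SimpleMds {

    public static double[][] embed(double[][] dissimilarity, int iterations,
                                   double learningRate, long seed) {
        int n = dissimilarity.length;
        Random rnd = new Random(seed);
        double[][] y = new double[n][2];
        for (double[] p : y) { p[0] = rnd.nextGaussian(); p[1] = rnd.nextGaussian(); }

        for (int it = 0; it < iterations; it++) {
            double[][] grad = new double[n][2];
            for (int i = 0; i < n; i++) {
                for (int j = i + 1; j < n; j++) {
                    double dx = y[i][0] - y[j][0], dy = y[i][1] - y[j][1];
                    double dist = Math.max(1e-9, Math.sqrt(dx * dx + dy * dy));
                    // gradient of (dist - delta)^2 w.r.t. y_i is
                    // 2 * (dist - delta) * (y_i - y_j) / dist
                    double coeff = 2.0 * (dist - dissimilarity[i][j]) / dist;
                    grad[i][0] += coeff * dx; grad[i][1] += coeff * dy;
                    grad[j][0] -= coeff * dx; grad[j][1] -= coeff * dy;
                }
            }
            for (int i = 0; i < n; i++) {
                y[i][0] -= learningRate * grad[i][0];
                y[i][1] -= learningRate * grad[i][1];
            }
        }
        return y;   // 2D coordinates preserving the given dissimilarities
    }
}
```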
4.3 User Interaction
User interaction is concerned with the way users generate a view and how they advance to the next one while gaining insights into the analyzed data. The usual approaches to interaction are the following [61]:

1. Brushing and linking – a change in perspective on one view of the data set affects other views as well. For example, refer to [62].
2. Panning and zooming – physically constrains the view in order to enable the user to analyze more general or more specific parts of the visualized space. This is the most prevalent method of user interaction, enabled by default in most visualization methods (a minimal viewport sketch is given at the end of this subsection).
3. Focus-plus-context – enables the user to enlarge a specific part of the view while simultaneously shrinking the context, making distant objects smaller. One example of this approach is the fisheye view.
4. Magic lenses – filters that can transform the view of a visualized object in order to emphasize some other characteristic. For example, in the Galaxies view [17], the documents can be viewed through a magic lens in order to discover their temporal dimension.

User interaction also consists of selecting time intervals, thematic categories, persons involved, or any other available structured unit of information upon which conditions can be set. Interaction techniques can be advanced in the sense that animations are enabled. For now, animation has been used in methods oriented towards the current stream change. In [11], the user can select a term and the visualization animates to make it a central point in the view. In [13], the visualization changes dynamically in real time as texts arrive.
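As referenced in item 2 above, panning and zooming reduce to a simple viewport transform from data coordinates to screen coordinates. The following is a minimal, hypothetical sketch; real tools add smooth animation and clamping.

```java
// Viewport transform behind panning and zooming.
public final class Viewport {

    private double centerX, centerY;   // data-space point shown at the screen centre
    private double zoom = 1.0;         // pixels per data unit
    private final double screenW, screenH;

    public Viewport(double screenW, double screenH) {
        this.screenW = screenW;
        this.screenH = screenH;
    }

    public void pan(double dxPixels, double dyPixels) {
        centerX -= dxPixels / zoom;    // dragging right moves the view left
        centerY -= dyPixels / zoom;
    }

    public void zoomBy(double factor) { zoom *= factor; }

    /** Data coordinates to screen coordinates. */
    public double[] toScreen(double x, double y) {
        return new double[] {
            (x - centerX) * zoom + screenW / 2.0,
            (y - centerY) * zoom + screenH / 2.0
        };
    }
}
```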
5 Insight, Subjectivity and Evaluation of Visualization

5.1 Insight and Subjectivity
The primary goal of the visualization process is insight that occurs within the user. Insight is a vaguely defined term closely connected with the increase of knowledge that helps users to define new questions and hypotheses [63], as well as to answer existing ones. It is subjective, so a careful approach to design and evaluation is required. In J. J. van Wijk's work [63], a general mathematical model of visualization within its usage context is given. According to this simple but useful model, the increase in knowledge is influenced by several subjective elements: starting knowledge, perceptual and cognitive skills, and specifications (choice of hardware, algorithm, and parameters). In addition, the user's motivation and interest are noted in [9]. In the context of computer applications and user interfaces, subjectivity is a property that is present when the process result substantially depends on the
person involved – this implies that visualization is a subjective process. In [63], a partial elimination of subjectivity is proposed by reducing the number of user-selectable parameters, since the choice of settings contributes to subjectivity. It is also suggested to expose the visualization to a number of users instead of only one in order to obtain a collective judgment. In order to create better visualizations, subjectivity has to be taken into account, and efforts to reduce it should be made during the design and evaluation phases.
5.2 Evaluation
Evaluation is important since users and researchers need to compare and validate visualization methods. The design of evaluation methods and methodologies for information visualization is still an ongoing effort of the research community. All difficulties are related to the human factor described in the previous subsection: in contrast to purely computational processes, not all variables can be observed and controlled. An intrinsic approach to evaluation draws its inspiration from the social sciences [64]. The other approach is extrinsic: to evaluate a visualization by assessing the economic value of decisions based on knowledge gained during the visualization [63]. Van Wijk also notes that the value of a visualization for which subsequent user actions cannot be defined is doubtful. The rest of this section is a brief summary of a recent overview on the topic of intrinsic evaluation written as a book chapter by S. Carpendale [9]. In the past, the usability of a visualization was emphasized without systematic research, since an easily understandable visualization speaks for itself. As the field developed, a plethora of methods were created, so an empirical approach was needed. In contrast to the current practice of artificial settings, it is noted that evaluations should be made by real users on real data sets and oriented towards real tasks. When choosing such realistic settings, it might be more difficult to obtain large enough sample sizes in order to control all variables or to get precise measurements. Many of the challenges in information visualization are common to empirical research in the following disciplines: human-computer interaction, perceptual psychology, and cognitive reasoning. There are three important factors in evaluation methodologies:
- Generalizability – can the results be extended to a wider scope of users or data sets?
- Precision – to what extent can the measurements be trusted?
- Realism – can the result be considered realistic from the aspect of the context it was studied in?
As discussed in [64], evaluation methodologies in the social sciences can be structured into the following cases: laboratory experiments, experimental simulations, field experiments, field studies, computer simulations, formal theories, sample surveys, and judgment studies. Each of these methodologies realizes one, or at most two, of the given factors, which suggests that using more than one methodology is preferable. Methodologies can be quantitative or qualitative. Quantitative methodology has evolved through centuries of experimental research, and its data are expressed using numbers. Challenges to quantitative methodologies are: conclusion validity, type I and II errors, internal validity, construct validity, external validity, and ecological validity.
Qualitative methodology uses data expressed in natural language. It can be categorized as observational, where the participants' behaviour is described, or as interview-based, where the participants give answers to a set of questions. Challenges to qualitative methodologies are: sample size, subjectivity, and analysis. Evaluation of information visualization is still a work in progress. It is a labour-intensive task, since manual work is evaluated and little automation can be employed. A number of researchers point out the problem of evaluation, and some of them encourage others to present more evaluation methodologies in their future work.
6 Applications
Text stream visualization has a wide range of applications – in all situations where texts have a time stamp or where the collections are not static. In the publications cited in this article, various successful applications are mentioned: analyzing news texts, scientific publications, institutional documents, company filings, patent claims, private and enterprise correspondence, web content, and historical archives. The intended users of text visualization suites are media analysts, historians and other scientists from all fields, public policy makers, intelligence officers, journalists, business managers, and private users. In the future, we can expect more applications using text visualization on the task of social media analysis. Also, apart from task-specific and data-specific applications, general applications are being developed. An example of such a general suite is Microsoft’s Pivot Project [65] which aims to enable seamless information visualization and interaction.
7 Conclusion
We have presented a general survey of text visualization methods. We have introduced and discussed three novel categorizations of visualization methods: data type, text representation, and the temporal drawing approach. Related areas of research were presented, and notable works were listed and their methods discussed. Subjectivity as an inherent property of insight was described, and suggested approaches to evaluation were structured. A wide scope of applications and users was noted. This field of study has great potential since the volume of digitally available texts has become huge. Advanced applications are expected to arise from using cutting-edge information extraction techniques developed within the computational linguistics community. Technical advances might include novel methods or solutions to scalability issues; applicative advances might include integration with collaborative visualization methods. More effort needs to be invested in the research and deployment of evaluation methodologies, and cognitive and psychological aspects need to be included in the research as well.

Acknowledgement. This research has been supported by the Croatian Ministry of Science, Education and Sports under grant No. 036-1300646-1986.
Appendix. Please find examples of visualizations at the following address: http://ktlab.fer.hr/˜artur/vis-survey-appendix.pdf
References 1. Scott Owen, G., Domik, G., Rhyne, T.M., Brodlie, K.W., Santos, B.S.: Definitions and rationale for visualization, http://www.siggraph.org/education/ materials/HyperVis/visgoals/visgoal2.htm (accessed in February 2010) 2. Friendly, M., Denis, D.: Milestones in the history of thematic cartography, statistical graphics, and data visualization, vol. 9 (2008) 3. Thomas, J.J., Cook, K.A.: Illuminating the Path: The Research and Development Agenda for Visual Analytics. National Visualization and Analytics Ctr (2005) 4. Risch, J., Kao, A., Poteet, S., Wu, Y.: Text Visualization for Visual Text Analytics. In: Simoff, S.J., B¨ ohlen, M.H., Mazeika, A. (eds.) Visual Data Mining. LNCS, vol. 4404, pp. 154–171. Springer, Heidelberg (2008) 5. Keim, D.A., Mansmann, F., Thomas, J.: Visual Analytics: How Much Visualization and How Much Analytics? In: ACM SIGKDD Workshop on Visual Analytics and Knowledge Discovery - VAKD 2009 (2009) 6. Tukey, J.W.: Exploratory Data Analysis. Addison-Wesley, Reading (1977) 7. Allan, J.: Tracking, Event- based Information Organization. Kluwer Academic Publishers, Dordrecht (2002) 8. Luo, D., Yang, J., Fan, J., Ribarsky, W., Luo, H.: EventRiver: Interactive Visual Exploration of Streaming Text. EG/IEEE EuroVis 2009 (2009) (to be published) 9. Carpendale, S.: Evaluating information visualizations. In: Information Visualization: Human-Centered Issues and Perspectives, pp. 19–45. Springer, Heidelberg (2008) 10. Tufte, E.R.: Visual Explanations. Graphics Press (1997) 11. Benson, J., Crist, D., Lafleur, P.: Agent-based visualization of streaming text. In: Proc. IEEE Info. Vis. Conf., Raleigh (2008) 12. Weskamp, M.: (2004), http://marumushi.com/projects/newsmap (acc. in Apr. 2010) 13. Albrecht-Buehler, C., Watson, B., Shamma, D.A.: Visualizing live text streams using motion and temporal pooling. IEEE Comp. Graph. App. 25(3), 52–59 (2005) 14. Mao, Y., Dillon, J., Lebanon, G.: Sequential document visualization. IEEE Transactions on Visualization and Computer Graphics 13(6), 1208–1215 (2007) 15. Linguistic Data Consortium: The New York Times Annotated Corpus, http:// www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2008T19 16. Leskovec, J., Backstrom, L., Kleinberg, J.M.: Meme-tracking and the dynamics of the news cycle. In: Proc. of the 15th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, pp. 497–506 (2009) 17. Wise, J.A., Thomas, J.J., Pennock, K., Lantrip, D., Pottier, M., Schur, A., Crow, V.: Visualizing the non-visual: Spatial analysis and interaction with information from text documents. In: Proc. IEEE Symp. Info. Vis., pp. 51–58 (1995) 18. Miller, N.E., Wong, P.C., Brewster, M., Foote, H.: TOPIC ISLANDS - a waveletbased text visualization system. In: Proc. 9th IEEE Conf. on Vis., pp. 189–196 (1998) 19. Berendt, B., Subasic, I.: STORIES in time: A graph-based interface for news tracking and discovery. In: Web Intel./IAT Workshops, pp. 531–534. IEEE, Los Alamitos (2009)
20. Kontostathis, A., Galitsky, L., Pottenger, W.M., Roy, S., Phelps, D.J.: A Survey of Emerging Trend Detection in Textual Data Mining (2003) 21. Yang, Y., Akers, L., Klose, T., Yang, C.B.: Text mining and visualization tools impressions of emerging capabilities. World Patent Info. 30(4), 280–293 (2008) 22. Salton, G., Wong, A., Yang, A.C.S.: A vector space model for automatic indexing. Communications of the ACM 18, 229–237 (1975) 23. Sebastiani, F.: Machine learning in automated text categorization. ACM Computing Surveys 34(1), 1–47 (2002) 24. Mei, Q., Zhai, C.: Discovering evolutionary theme patterns from text: an exploration of temporal text mining. In: Grossman, R., Bayardo, R.J., Bennett, K.P. (eds.) Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Chicago, Illinois, USA, August 21-24, pp. 198–207. ACM, New York (2005) 25. Moens, M.F.: Information Extraction, Algorithms and Prospects in a Retrieval Context. Springer, Heidelberg (2006) 26. Gregory, M., Chinchor, N., Whitney, P.: User-directed sentiment analysis: Visualizing the affective content of documents. In: Proc. of the Workshop on Sentiment and Subjectivity in Text. Association for Computational Linguistics (2006) 27. Sammon, J.: A nonlinear mapping for data structure analysis. IEEE Transactions on Computing 5(18), 401–409 (1969) 28. Lin, X., Soergel, D., Marchionini, G.: A Self-organizing semantic map for information retrieval. In: Proc. 14th. Ann. Int. ACM SIGIR Conf. on R&D In Information Retrieval, pp. 262–269 (1991) 29. Chalmers, M., Chitson, P.: Bead: Explorations in information visualization. In: Proc. of the 15th ACM SIGIR Conf. on R&D in Information Retrieval (1992) 30. Rennison, E.: Galaxy of news: An approach to visualizing and understanding expansive news landscapes. In: ACM User Interface Soft. and Tech., pp. 3–12 (1994) 31. Davidson, G.S., Hendrickson, B., Johnson, D.K., Meyers, C.E., Wylie, B.N.: Knowledge mining with VxInsight: Discovery through interaction. J. Intell. Inf. Syst. 11(3), 259–285 (1998) 32. Kaski, S., Lagus, K., Kohonen, T.: WEBSOM - Self-organizing maps of document collections. Neurocomputing 21, 101–117 (1998) 33. Risch, J., Rex, D., Dowson, S., Walters, T., May, R., Moon, B.: The starlight information visualization system. In: Proc. IEEE Conf. Info. Vis., pp. 42–49 (1997) 34. Havre, S., Hetzler, E.G., Nowell, L.T.: Themeriver: Visualizing theme changes over time. In: Proc. IEEE Conf. Info. Vis., pp. 115–124 (2000) 35. Kab´ an, A., Girolami, M.: A dynamic probabilistic model to visualise topic evolution in text streams. J. Intelligent Information Systems 18(2-3), 107–125 (2002) 36. Andrews, K., Kienreich, W., Sabol, V., Becker, J., Droschl, G., Kappe, F., Granitzer, M., Auer, P., Tochtermann, K.: The infosky visual explorer: exploiting hierarchical structure and document similarities. Info. Vis. 1(3-4), 166–181 (2002) 37. Wong, P.C., Foote, H., Adams, D., Cowley, W., Thomas, J.: Dynamic visualization of transient data streams. In: Proc. IEEE Symp. Info. Vis. (2003) 38. Fortuna, B., Grobelnik, M., Mladenic, D.: Visualization of text document corpus. Informatica (Slovenia) 29(4), 497–504 (2005) 39. Paulovich, F.V., Minghim, R.: Text map explorer: a tool to create and explore document maps. In: IV, pp. 245–251. IEEE Computer Society, Los Alamitos (2006) 40. 
Don, A., Zheleva, E., Gregory, M., Tarkan, S., Auvil, L., Clement, T., Shneiderman, B., Plaisant, C.: Discovering interesting usage patterns in text collections: integrating text mining with visualization. In: Proc. 16th ACM Conf. Information and Knowledge Management, CIKM, pp. 213–222 (2007)
41. Ghoniem, M., Luo, D., Yang, J., Ribarsky, W.: NewsLab: Exploratory Broadcast News Video Analysis. In: IEEE Symp. Vis. Analytics Sci. and Tech., pp. 123–130 (2007) 42. Paulovich, F.V., de Oliveira, M.C.F., Minghim, R.: The projection explorer: A flexible tool for projection-based multidimensional visualization. In: Proc. 20th Brazilian Symp. Comp. Graph. and Image Processing (SIBGRAPI), pp. 27–36 (2007) 43. Alencar, A.B., de Oliveira, M.C.F., Paulovich, F.V., Minghim, R., Andrade, M.G.: Temporal-PEx: Similarity-based visualization of time series. In: Proc. 20th Brazilian Symp. Comp. Graph. and Image Processing, SIBGRAPI (2007) 44. Ishikawa, Y., Hasegawa, M.: T-Scroll: Visualizing trends in a time-series of documents for interactive user exploration. In: Kov´ acs, L., Fuhr, N., Meghini, C. (eds.) ECDL 2007. LNCS, vol. 4675, pp. 235–246. Springer, Heidelberg (2007) 45. Terachi, M., Saga, R., Sheng, Z., Tsuji, H.: Visualized technique for trend analysis of news articles. In: Nguyen, N.T., Borzemski, L., Grzech, A., Ali, M. (eds.) IEA/AIE 2008. LNCS (LNAI), vol. 5027, pp. 659–668. Springer, Heidelberg (2008) 46. Petrovi´c, S., Dalbelo Baˇsi´c, B., Morin, A., Zupan, B., Chauchat, J.H.: Textual features for corpus visualization using correspondence analysis. Intell. Data Anal. 13(5), 795–813 (2009) 47. Strobelt, H., Oelke, D., Rohrdantz, C., Stoffel, A., Keim, D.A., Deussen, O.: Document cards: A top trumps visualization for documents. IEEE Trans. Vis. Comput. Graph 15(6), 1145–1152 (2009) 48. Prabowo, R., Thelwall, M., Alexandrov, M.: Generating overview timelines for major events in an RSS corpus. J. Informetrics 1(2), 131–144 (2007) 49. Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K., Harshman, R.: Indexing by latent semantic analysis. J. American Society for Info. Science 41 (1990) 50. Jackson, J.E.: A User’s Guide to Principal Components. John Willey, New York (1991) 51. Greenacre, M.J.: Correspondence analysis in practice. Chapman and Hall, Boca Raton (2007) 52. Kruskal, J.B., Wish, M.: Multidimensional Scaling. Sage Publications, CA (1978) 53. York, J., Bohn, S., Pennock, K., Lantrip, D.: Clustering and dimensionality reduction in spire. In: AIPA Steering Group. Proc. Symp. Advanced Intelligence Processing and Analysis. Office of R&D, Washington (1995) 54. Paulovich, F.V., Nonato, L.G., Minghim, R., Levkowitz, H.: Least square projection: A fast high-precision multidimensional projection technique and its application to document mapping. IEEE T. Vis. Comp. Graph. 14(3), 564–575 (2008) 55. Kohonen, T.: Self-Organizing Maps. Springer, Berlin (1995) 56. Fruchterman, T.M.J., Reingold, E.M.: Graph drawing by force-directed placement. Software: Practice and Experience 21(11), 1129–1164 (1991) 57. Minghim, R., Paulovich, F.V., Lopes, A.A.: Content-based text mapping using multidimensional projections for exploration of document collections. In: IS&T/SPIE Symp. on Elect. Imag. - Vis. and Data Anal., San Jose (2006) 58. Shneiderman, B.: Treemaps for space-constrained visualization of hierarchies, http://www.cs.umd.edu/hcil/treemap-history/index.shtml (accessed in April 2010) 59. Berndt, D.J., Clifford, J.: Using dynamic time warping to find patterns in time series. In: AAAI Workshop on Knowledge Discovery in Databases, pp. 359–370 (1994)
60. Keogh, E.J., Lonardi, S., Ratanamahatana, C.A., Wei, L., Lee, S.H., Handley, J.: Compression-based data mining of sequential data. Data Min. Knowl. Discov. 14(1), 99–129 (2007) 61. Hearst, M.: User Interfaces and Visualization. Addison-Wesley Longman, Amsterdam (1999) 62. Eler, D.M., Paulovich, F.V., de Oliveira, M.C.F., Minghim, R.: Coordinated and multiple views for visualizing text collections. In: IEEE 12th Conf. Info. Vis., pp. 246–251 (2008) 63. van Wijk, J.J.: Views on visualization. IEEE T. Vis. Comp. Graph. 12(4) (2006) 64. McGrath, J.E.: Methodology matters: doing research in the behavioral and social sciences. Morgan Kaufmann Publishers Inc., San Francisco (1995) 65. Microsoft: Pivot project, http://www.getpivot.com/
Combining Semantic and Content Based Image Retrieval in ORDBMS

Carlos E. Alvez 1 and Aldo R. Vecchietti 2

1 Facultad de Ciencias de la Administración, Universidad Nacional de Entre Ríos
[email protected]
2 INGAR - UTN Facultad Regional Santa Fe
[email protected]
Abstract. In this article, an architecture for image retrieval in an Object-Relational Database Management System is proposed. It combines the use of low-level descriptors and semantic metadata for similarity search. The architecture has three levels: content-based, semantic data, and an interface integrating them. Several database User Defined Types (UDTs) and operations are defined for that purpose. A case study about vehicles is implemented, and the results obtained show an important improvement in image similarity search.

Keywords: content – semantic – image retrieval – ORDBMS.
1 Introduction

Several techniques have been proposed during the last decade for recovering images using content-based retrieval (CBIR). Most of them are limited by the semantic gap separating the low-level information from its metadata annotations. The semantic gap in image retrieval is the difference between the low-level data extracted and the interpretation the user has of the same picture [1]. Nowadays, the trend in image retrieval is the integration of both low-level and semantic data, and several approaches can be found in the literature on this topic. RETIN is a search engine developed by Gony et al. [2] with the goal of bridging the semantic gap. It uses an interactive process that allows the user to refine the query as much as necessary; the interaction with the user consists of binary labels indicating whether a document belongs to a category or not. SemRetriev, proposed by Popescu et al. [3], is a prototype system which uses an ontology to structure an image repository in combination with CBIR techniques. The repository includes pictures gathered from the Internet. Two methods are employed for image recovery, based on keywords and on visual similarities, where the proposed ontology is used in both cases. Atnafu et al. [4] proposed several similarity-based operators for recovering images stored in relational database tables. This is one of the first efforts to integrate content-based and semantic retrieval of pictures in a DBMS. Tests were performed by extending the prototype called EMIMS: the authors use a database as image repository and added several algorithms to EMIMS.
Research in image retrieval has been handled separately in database management systems and computer vision. In general, systems related to image retrieval assume some sort of system containing complex modules for processing images stored in a database. Nowadays, commercial DBMSs offer more tools to incorporate semantics, such that queries can be formulated with ontology assistance or via traditional data. In this paper, we introduce a software architecture to address image retrieval under an Object-Relational Database Management System (ORDBMS) [5], combining low-level and semantic features with the objective of maximizing the use of the components and tools the DBMS provides. In this sense, the architecture is composed of several User Defined Types (UDTs) and methods. The article is outlined as follows: first the proposed architecture is presented and the levels composing it are explained in more detail. Then a vehicle case study is introduced, where the reference ontology used for vehicle classification and the MPEG-7 low-level descriptors selected for CBIR are described. After that, the considerations made to include semantic data for images are explained. Finally, the experiments performed, the results obtained and the conclusions are presented.
2 Proposed Architecture

The architecture proposed can be seen in Fig. 1. Three main parts compose this model:

Low level: in charge of loading, extracting and managing the visual descriptors of the images (color, texture, etc.). Combinations of user-defined types (UDTs), Java functions and SQL queries are employed in this layer.

Semantic: corresponds to the database interfaces used to introduce semantic ontology concepts. Images are related to these concepts such that semantic SQL queries can be performed.

Connection: the interface linking the low-level and semantic features. This layer is the main contribution of this work. Special UDT functions and operators have been defined in order to reduce the semantic gap.

The database consists of an infrastructure of UDTs to support images, matrices, arrays of low-level descriptors and semantic information. Fig. 2 shows the UML class diagram defined for the design of the object-relational schema. On top of the hierarchy there is an interface SetOperations containing the definition of set operators which can be used to query low-level or semantic features separately or combined. Image is a class defined to store the image and some other properties of it; it inherits the operations declared in SetOperations. Image has a composition relationship with the generic LowLevelDescriptor abstract class, from which classes implementing specific low-level descriptors can be derived. In this class, the method similarScore is used to obtain a set of images having similar low-level descriptor values. The SemanticData class is included to support the domain ontology and also inherits from the interface SetOperations; semResultSet is the method that returns a set of images with the same semantic values. The classes of Fig. 2 are transformed (via mapping functions) into UDTs when defining the ORDBMS schema based on the SQL:2003 standard [5].
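A plain Java mirror of this design can make the roles of the types concrete. The following sketch uses the names from the paper (SetOperations, Image, LowLevelDescriptor, SemanticData, similarScore, semResultSet), but the signatures and the ImageRef helper are assumptions, since the paper does not list them; the actual implementation maps these classes to SQL:2003 UDTs.

```java
import java.util.Set;

/** Opaque reference to a stored image row (hypothetical helper type). */
record ImageRef(long id) {}

interface SetOperations {
    Set<ImageRef> union(Set<ImageRef> a, Set<ImageRef> b);
    Set<ImageRef> intersection(Set<ImageRef> a, Set<ImageRef> b);
    Set<ImageRef> difference(Set<ImageRef> a, Set<ImageRef> b);
}

abstract class LowLevelDescriptor {
    /** Extracts the descriptor values from the raw image bytes. */
    abstract void extract(byte[] imageBlob);
    /** Returns references to images with similar descriptor values. */
    abstract Set<ImageRef> similarScore(ImageRef query, int maxResults);
}

abstract class SemanticData {
    /** Returns references to images matching a concept of the ontology. */
    abstract Set<ImageRef> semResultSet(String concept);
}

abstract class Image implements SetOperations {
    byte[] blob;                       // picture content (a BLOB in the UDT)
    int width, height;
    LowLevelDescriptor[] descriptors;  // composition with the descriptor UDTs
    SemanticData semantics;
}
```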
Fig. 1. Architecture proposed for image retrieval
Fig. 2. UML class diagram model of the object-relational database defined for the architecture
2.1 Low Level

The Image UDT is defined for picture loading and management. Images are stored in a BLOB (Binary Large Object) attribute, together with other attributes that store characteristics like size, width, height, etc. The operations importImage and exportImage are employed to import and export images from files, respectively. This UDT also includes attributes corresponding to low-level descriptor UDTs, as a consequence of mapping the composition relationship with LowLevelDescriptor shown in Fig. 2. Low-level UDTs are implemented according to the application domain: for general images, the descriptors proposed by the MPEG-7 standard can be employed; for more specific applications like biometrics and medicine, explicit domain descriptors must be used. The set operators inherited from the interface can be redefined in the Image UDT for special purposes. This infrastructure of UDTs and operators facilitates CBIR queries using SQL sentences, which are familiar to the IT community.

2.2 Semantic

Semantic data are not loaded automatically, and several alternatives are possible for working with them: metadata generated by the user, structured concepts extracted from the Internet, ontology-driven concepts, etc. Since the semantic model must be structured in order to query it by its content, it is convenient to rely on a domain ontology like the one proposed by the MPEG-7 group [6]. In this work, semantic information is introduced for every image in the database, and the SemanticData UDT is generated for this purpose. Concepts taken from the domain ontology, or literals x, are associated with a specific Image UDT instance y by means of a triple of the form <subject property object>. Links can be established by asserted triples or can be inferred by means of other properties like "subclass of", "range", "subproperty of", etc. [7]. For example, following the definition made by Allemang and Hendler [8], given the diagram of Fig. 3, where two ontology classes (C and B) are associated by a subClassOf relationship, and given the property p whose rdfs:domain is class B, then, if we assert a triple that uses p with subject y, we can infer:
- y is of type B, because p has rdfs:domain B;
- y can further be inferred to be of type C through the rdfs:subClassOf relationship.
Fig. 3. Example of graph with rdfs:domain and rdfs:subClassOf property
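The two inference rules just described can be made concrete without any RDF machinery. The following is a minimal, hypothetical sketch and not the paper's implementation; Oracle's semantic engine, used later in the case study, performs this kind of entailment itself.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Library-free rdfs:domain / rdfs:subClassOf inference over a list of triples.
public final class TinyRdfs {

    public record Triple(String subject, String predicate, String object) {}

    /** All classes that resource y can be inferred to belong to. */
    public static Set<String> inferredTypes(String y, List<Triple> triples) {
        Set<String> types = new HashSet<>();
        for (Triple t : triples) {
            // Rule 1: <y p x> and <p rdfs:domain B>  =>  y is of type B
            if (t.subject().equals(y)) {
                for (Triple d : triples) {
                    if (d.subject().equals(t.predicate())
                            && d.predicate().equals("rdfs:domain")) {
                        types.add(d.object());
                    }
                }
            }
            // Asserted types: <y rdf:type B>
            if (t.subject().equals(y) && t.predicate().equals("rdf:type")) {
                types.add(t.object());
            }
        }
        // Rule 2: <B rdfs:subClassOf C>  =>  every y of type B is also of type C
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Triple t : triples) {
                if (t.predicate().equals("rdfs:subClassOf")
                        && types.contains(t.subject()) && types.add(t.object())) {
                    changed = true;
                }
            }
        }
        return types;
    }
}
```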
RDF queries are expressed by replacing any component of a triple with a question mark, which defines a variable that must be bound. The result S of such a query is the set of image references satisfying the query triple; for example, a query on the vehicle-type concepts returns references to images matching car, wagon, 5_doors, 4_doors, etc. (see the ontology of Fig. 4).

2.3 Low-Level and Semantic Integration

The previous levels can be queried individually; when low-level and semantic data are needed together, the interface SetOperations, which both levels inherit, is used to integrate them and obtain the results of combined queries. SetOperations allows the definition of combined union, intersection and difference set operations, which call the similarScore and semResultSet functions of the LowLevelDescriptor and SemanticData UDTs, respectively. Through these methods it is also possible to combine the results of two low-level queries or two semantic queries. Both similarScore and semResultSet return a set of Image references; since both methods return a set, they can easily be employed in SQL queries using clauses such as IN and EXISTS.
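As a minimal illustration of what these set operations amount to, the following is a hypothetical standalone version; in the architecture itself they are UDT methods invoked from SQL.

```java
import java.util.HashSet;
import java.util.Set;

// Combining a content-based result set with a semantic result set.
public final class CombinedQuery {

    public static <T> Set<T> intersection(Set<T> contentBased, Set<T> semantic) {
        Set<T> result = new HashSet<>(contentBased);
        result.retainAll(semantic);   // similar in content AND matching the concept
        return result;
    }

    public static <T> Set<T> union(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);
        result.addAll(b);
        return result;
    }

    public static <T> Set<T> difference(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);
        result.removeAll(b);
        return result;
    }
}
```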
3 Case Study

For the case study presented in this paper, the Oracle 11g database was used, the purpose being to benefit from the Oracle ORDBMS and its Semantic Technologies tools [9]. In the experiment, vehicle images gathered from Flickr by means of Downloadr [10] are stored in the database; tags and texts are used to obtain the pictures from the Flickr site. Section 3.1 explains the low-level descriptors selected and how their data are obtained. Fig. 4 presents an excerpt of the vehicle ontology used in this example; image tags and texts have been introduced as concepts in the database as RDF triples. Section 3.2 describes this issue in more detail.
Fig. 4. Excerpt of vehicle reference ontology (the hierarchy includes the classes Vehicle, Truck, Car, Bus, Light_Truck, Furgon, PickUp, Wagon, Station, Van, 5_dors, 4_dors, 3_dors, Coupe, 2_dors, and Convertible)
3.1 Low-Level Implementation

Four UDT subclasses of LowLevelDescriptor are implemented in Oracle to load the following MPEG-7 standard descriptors: Dominant Color (DCD), Color Layout (CLD), Scalable Color (SCD) and Edge Histogram (EHD). Extract methods were defined to obtain the descriptor data, and similarScore for content-based search. The similarScore method was written in PL/SQL, while extract is Java wrapper code linked to the LIRe library defined by Caliph&Emir [11]. The extract methods are static methods that create the descriptor objects, obtain the descriptor data, and store the information in the corresponding arrays defined in the UDTs [12]. For example, extractionEHD creates an instance of EdgeHistogramImplementation, calls the method getStringRepresentation, and returns the 80 bins of this descriptor. The code is compiled and loaded into the database by means of the Loadjava utility. In order to use the ExtractionEHD function in Oracle, another wrapper written in PL/SQL is needed for each method in JavaWrapper.class. Fig. 5 represents the implementation model of the low-level data descriptors. The database table ImgFlickr is defined to store objects of the Image UDT. Each table row contains data items of the low-level descriptor UDTs, so that once an image is loaded the appropriate extract method is invoked to initialize the objects.

3.2 Semantic Implementation

The reference ontology was generated with the tags used to download the images from Flickr in mind; it categorizes different vehicle types. By means of Oracle Semantic Technologies, a "semantic data network" was created, together with an RDF Vehicle model. These tasks are completed by implementing the SemanticData UDT and the semResultSet function. Besides the triples of Fig. 4, some others are defined to relate concepts to class instances, domains, etc. Every image loaded into the database table ImgFlickr is linked to the reference ontology by adding a triple which associates the corresponding row of the ImgFlickr table with the hierarchy shown in Fig. 4. Other functions provided by Oracle Semantic Technologies can be employed after this step; for example, SEM_APIS.CREATE_ENTAILMENT creates a rule index to perform RDFS, OWLPRIME, or user-rule inferencing, and SEM_MATCH queries the semantic data directly.
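As an illustration of the extraction wrappers described in Sect. 3.1, the following is a hypothetical sketch of such a static method. The class and method names (EdgeHistogramImplementation, getStringRepresentation) follow the paper's description, but the exact LIRe/Caliph&Emir signatures, including the constructor used here and the import path, vary by version and are assumptions.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import javax.imageio.ImageIO;
// import ...EdgeHistogramImplementation;  // package path depends on the LIRe version

// Static wrapper loaded into Oracle with loadjava and called from PL/SQL.
public final class JavaWrapper {

    /** Returns the 80-bin Edge Histogram descriptor as a string of bin values. */
    public static String extractionEHD(byte[] imageBlob) throws Exception {
        BufferedImage image = ImageIO.read(new ByteArrayInputStream(imageBlob));
        // Assumed LIRe call, following the paper's description of the wrapper.
        EdgeHistogramImplementation ehd = new EdgeHistogramImplementation(image);
        return ehd.getStringRepresentation();
    }
}
```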
Fig. 5. Low-level description implementation model
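The Oracle Semantic Technologies calls themselves are not reproduced here; the following sketch uses the rdflib library and invented URIs only to illustrate the kind of triples that relate ontology concepts and link an image row to the hierarchy of Fig. 4:

```python
from rdflib import Graph, Namespace, RDF, RDFS

# Hypothetical namespaces standing in for the vehicle ontology and the ImgFlickr table.
VEH = Namespace("http://example.org/vehicle#")
IMG = Namespace("http://example.org/imgflickr/")

g = Graph()

# A fragment of the Fig. 4 hierarchy expressed as rdfs:subClassOf triples.
g.add((VEH.Car, RDFS.subClassOf, VEH.Vehicle))
g.add((VEH.Wagon, RDFS.subClassOf, VEH.Car))

# Linking an image row to a concept: the image stored in row 4711 is a Wagon.
g.add((IMG["4711"], RDF.type, VEH.Wagon))

# A query with a variable in the subject position: all images typed as Wagon.
print(list(g.subjects(RDF.type, VEH.Wagon)))
```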
3.3 Low-Level and Semantic Integration Implementation

The results returned by semResultSet and similarScore are sets of image references that can be combined and manipulated by defining and implementing the functions of the SetOperations interface. These operations can be overloaded and/or overridden for special purposes, giving several options and flexibility when developing an application.
4 Experiments and Results

A medium-sized database containing 10,000 images gathered from Flickr has been generated to perform the analysis; 30 of them are proposed as query patterns for the descriptors. For each one, between 15 and 20 visually similar images are proposed as the ground truth set (gts) [13]. The recovery rate (RR) is obtained as follows:
$$\mathrm{RR}(q) = \frac{NF(\alpha, q)}{NG(q)} \qquad (1)$$
where NG(q) is the gts size for query q and NF(α, q) is the number of gts images found among the first αNG(q) retrieved. RR(q) takes values between 0 and 1: a value of 0 means no images of the gts have been found, while 1 indicates that all images of the gts were recovered. The α factor must be ≥ 1; the tolerance is larger for higher α, which increases the likelihood of obtaining images from the gts. For the experiment we set α = 1.6 and NG(q) = 5, meaning that NF(α, q) is the number of gts images recovered among the first 8. Figure 6 presents an example of a query and the gts proposed. The performance of the selected descriptors over all queries (NQ = 30 in our study) is evaluated by the average recovery rate (ARR) [13], given by:
$$\mathrm{ARR} = \frac{1}{NQ} \sum_{q=1}^{NQ} \mathrm{RR}(q) \qquad (2)$$

Fig. 6. Examples of ground truth set
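A small numeric sketch of equations (1) and (2) under the paper's setting α = 1.6 and NG(q) = 5; the ranked result lists and ground truth sets below are invented:

```python
ALPHA = 1.6

def recovery_rate(ranked_ids, gts, alpha=ALPHA):
    """RR(q) = NF(alpha, q) / NG(q): fraction of the ground-truth set found
    among the first alpha * NG(q) retrieved images."""
    ng = len(gts)
    cutoff = int(round(alpha * ng))          # with NG = 5 and alpha = 1.6 -> 8
    nf = len(set(ranked_ids[:cutoff]) & set(gts))
    return nf / ng

def average_recovery_rate(queries):
    """ARR = (1 / NQ) * sum over queries of RR(q)."""
    return sum(recovery_rate(r, g) for r, g in queries) / len(queries)

# Two invented queries, each with a 5-image ground truth set.
queries = [
    ([3, 9, 1, 4, 8, 2, 7, 11, 5], [1, 2, 3, 4, 5]),       # RR = 4/5 = 0.8
    ([10, 1, 12, 13, 14, 2, 15, 16, 3], [1, 2, 3, 4, 5]),  # RR = 2/5 = 0.4
]
print(average_recovery_rate(queries))   # ~0.6
```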
Different levels of the hierarchy tree of Fig. 4 are selected as query objectives in order to compare the efficiency and precision of the answers. The results obtained with and without the semantic model are shown in Table 1. For example, for two-door sedan cars the comparison is made by considering the whole set of car images against the set of 2_doors cars. The inference made by using the specified domain and the property rdfs:subClassOf can yield any of these values: 2_doors, Convertible, Coupe; images qualified by the 2_doors concept are visited, and also those belonging to Coupe and Convertible. The experiment is performed as follows: one query for every image in the database using each of the descriptors presented in Section 3.1, and one query for every image satisfying the condition of the selected tree node, either directly or by inference. In total, 240 queries have been executed. Table 1 reports the average recovery rate obtained for each of the descriptors selected for this experiment; results without and with the semantics are shown in the first and second row of Table 1, respectively. From Table 1 it can be concluded that the SCD descriptor behaves best when recovering images by similarity. It can also be observed that there is an important improvement in retrieving similar images when semantic inference is employed together with the low-level descriptor data. Figure 7 shows the images obtained with queries involving the SCD descriptor integrated with semantic data.
Fig. 7. Results obtained with SCD descriptor. The left most image of each row shows the query image, the other four on the right show the most relevant retrieved.
5 Conclusions

In this article, an architecture for image retrieval in an Object-Relational Database Management System is proposed and implemented in the Oracle 11g database. It combines the use of low-level descriptors and semantic data for similarity search. An infrastructure of UDTs and operator functions is built to support images and their characteristics. This infrastructure can be easily adapted to specific application domains. The proposed architecture shows that an ORDBMS provides all the tools needed to develop semantic and content-based image recovery in a uniform manner. These capabilities can be seen in the implemented case study, where the MPEG-7 low-level descriptors Dominant Color (DCD), Color Layout (CLD), Scalable Color (SCD) and Edge Histogram (EHD) are used. The extraction and loading of descriptor data is done via Java and PL/SQL wrappers linked to the LIRe library. Semantic data is included by means of RDF triples using Oracle tools. Ground truth sets of images and the average recovery rate (ARR) are defined to drive the experiments. The scheme used to implement the vehicle images case study is simple but generic. Semantic data of the images have been loaded into the database using reference ontology concepts and properties. A total of 240 queries have been executed. The results obtained with the proposed scheme demonstrate an important improvement in image similarity search.

Table 1. ARR values obtained with the queries performed
                                      SCD        CLD        DCD        EHD
Low-level features query              0.290816   0.105442   0.068027   0.097506
Low-level features + semantic query   0.450397   0.321429   0.265873   0.321145
Acknowledgments. The authors thank the Universidad Tecnologica Nacional (UTN) and the Universidad Nacional Entre Ríos (UNER) for the financial support received.
References 1. Neumamm, D., Gegenfurtner, K.: Image Retrieval and Perceptual Similarity. ACM Transactions on Applied Perception 3(1), 31–47 (2006) 2. Gony, J., Cord, M., Philipp-Foliguet, S., Philippe, H.: RETIN: a Smart Interactive Digital Media Retrieval System. In: ACM Sixth International Conference on Image and Video Retrieval – CIVR 2007, Amsterdam, The Netherlands, July 9-11, pp. 93–96 (2007) 3. Popescu, A., Moellic, P.A., Millet, C.: SemRetriev – an Ontology Driven Image Retrieval System. In: ACM Sixth International Conference on Image and Video Retrieval – CIVR 2007, Amsterdam, The Netherlands, July 9-11, pp. 113–116 (2007)
4. Atnafu, S., Chbeir, R., Coquil, D., Brunie, L.: Integrating Similarity-Based Queries in Image DBMSs. In: 2004 ACM Symposium on Applied Computing, Nicosia, Cyprus, March 14-17, pp. 735–739 (2004) 5. Jim, M. (ISO-ANSI Working Draft) Foundation (SQL/Foundation), ISO/IEC 9075-2:2003 (E), United States of America, ANSI (2003) 6. Hunter, J.: Multimedia Content and the Semantic Web: Methods, Standards and Tools. In: Adding Multimedia to the Semantic Web - Building and Applying an MPEG-7 Ontology. Wiley, Chichester (2006) 7. Brickley, D., Guha, R.V.: RDF Vocabulary Description Language 1.0: RDF Schema. W3C Recommendation (February 10, 2004), http://www.w3.org/TR/rdf-schema/ 8. Allemang, D., Hendler, J.: Semantic Web for the working Ontologist. In: Effective Modeling in RDFS and OWL. Morgan Kaufman, San Francisco (2008) 9. Murray, C.: Oracle Database Semantic Technologies Developer’s Guide. 11g Release 1 (11.1) Part B28397-02 (September 2007) 10. Flickr Downloadr 2.0.4, http://flickrdownloadr.codeplex.com 11. Caliph & Emir: Java & MPEG-7 based tools for annotation and retrieval of digital photos and images, http://sourceforge.net/projects/caliph-emir 12. Alvez, C., Vecchietti, A.: A model for similarity image search based on object-relational database. IV Congresso da Academia Trinacional de Ciências, 7 a 9 de Outubro de 2009 Foz do Iguaçu - Paraná / Brasil (2009) 13. Manjunath, B., Ohm, J.R., Vasudevan, V., Yamada, A.: Color and texture descriptors. IEEE Trans. on Circuits and Systems for Video Technology 11(6), 703–715 (2001)
A Historically-Based Task Composition Mechanism to Support Spontaneous Interactions among Users in Urban Computing Environments Angel Jimenez-Molina and In-Young Ko Department of Computer Science, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea {anjimenez,iko}@kaist.ac.kr
Abstract. The area of Urban Computing has emerged from the paradigm of Ubiquitous Computing. Urban Computing shares the same requirements with its predecessor. However, Urban Computing needs to support spontaneous social groups. Therefore, a key requirement of Urban Computing is spontaneity, which is about composing services during runtime, without having predefined templates of applications. This paper leverages the approach of task-oriented computing, in order to compose a task by extending an existing task template with new or substitute functionality. The inclusion of new functionality is realized by analyzing the history of task execution. The appropriateness of the task composition mechanism is illustrated by an example scenario in our campus. Keywords: Urban Computing, Task-oriented Computing, Task Composition, Place-awareness, Ubiquitous Computing.
1 Introduction

Recently the paradigm of Ubiquitous Computing (Ubicomp) has extended its principles and approaches toward the emergent area of Urban Computing (UrbComp). They share the goal of supporting individuals or groups of users with available services and networked resources in the environment, anytime, anywhere [1]. Moreover, UrbComp has inherited a set of common requirements from Ubicomp: user-centricity, context sensitivity, composability of services, and dynamicity of users' goals and environmental information. However, UrbComp environments are denser in terms of the number of users, larger in their physical settings, more diverse in the type of users, and more dynamic in terms of their goals and preferences. In such an environment, there exist opportunities for users to engage each other in social groups through serendipitous social encounters – physical or virtual. These features open new challenges for supporting these spontaneous interactions of users in UrbComp environments by selecting and composing the appropriate available services during runtime. These challenges require the definition of a new requirement: spontaneity. Since a social group comes from an ad hoc interaction of users, such a requirement consists of
creating new composite services during runtime. Moreover, due to the diversity of service requirements of social groups, these new composites need to be created without having applications fully established in predefined descriptions [2]. To the best of our knowledge, the literature does not yet provide solutions to compose services in a spontaneous fashion that support the specific characteristics of UrbComp. In our previous work [2], [3], [4], we have leveraged the approach of task-oriented computing [5] to realize the requirements of Ubicomp environments. In this approach, social group goals are represented as tasks. A task is composed of a set of unit-tasks whose coordination logic is predefined in an application template. Each unit-task is a basic configuration of abstract services. In addition, unit-tasks are arranged by subsumption relationships in a hierarchical ontology. Therefore, in order to overcome the challenge of spontaneous composition, a set of appropriate unit-tasks as well as an appropriate template are first selected based on context information tightly related to the space. This information is termed placeness context [2]. Fig. 1 shows a unit-task description model with the essential semantic elements used to select unit-tasks: spatial and temporal information attached to the UrbComp environment, and personal and social information associated with social groups. This model is used by the task selection algorithm described in [2].
Fig. 1. Representation of Unit-tasks Semantics [2]
Our solution consists of spontaneously composing a new task by extending the selected template with the selected unit-tasks. This composition is realized through two approaches: (1) by semantically contrasting the functionality of the template against the functionality of the unit-tasks in order to meet a placeness context; (2) by taking into account how users have historically accepted the functionality embedded in a unit-task in that specific placeness context situation. The rationale of this historically-based approach is supported by studies of users' behavior in daily-life activities [6], carried out in the field of behavioral psychology. Essentially, these studies show that users behave similarly in similar situations. The focus of this
paper is twofold: first, the description of a structural model of a task template; second, the provision of a task composition mechanism. The structural model makes it easier to compare the functionality of the selected template against the selected unit-tasks. We have also developed a task composition mechanism that uses this structural model, the selected unit-tasks, and the placeness context. This mechanism systematically performs two major steps – an analysis of unit-task substitutability and an analysis of unit-task addition. These steps are applied to the selected template after reducing the search space of candidate unit-tasks used to extend the template. An illustrative example scenario is provided to show the appropriateness of the task composition mechanism. The roadmap of this paper is as follows. Section 2 analyzes the related work. Section 3 describes the general structural model of a task template. Section 4 proposes the task composition mechanism. Section 5 describes the implementation of the mechanism, illustrated by an example scenario, and Section 6 concludes the paper.
2 Related Work

Some studies have focused on recognizing activities to support social groups based on the implicit meanings associated with an urban place. In this category of work is P3 (people-to-people-to-geographical-places) [7], a system that associates social networks with geographical places. Another example is MobiSoc [8], which arranges information about people, social relations, and physical spaces. In [9] social groups are clustered in relation to a common place, with the aim of recommending that users join appropriate social groups. A major limitation of these approaches is that they do not enhance spontaneous interactions of users by suggesting appropriate tasks. On the other hand, it is pertinent to consider certain works in the related areas of Ubicomp and Service Oriented Computing (SOC). Ubicomp has traditionally offered approaches based on the definition of applications or service composites in templates, which are selected and executed based on contextual information [3], [10], [11]. The disadvantage of this approach is that particular ad hoc combinations of context information may not lead to the selection of appropriate applications or service composites. SOC offers the approach of automatic composition of services. Unfortunately, this approach does not support the requirement of context awareness. In addition, automatic composition of services requires explicit interventions of users to define the goal of such a composition.
3 Structural Model of a Template

The absence of an explicit global goal to meet hinders the spontaneous composition of a task by extending an existing template during runtime. Therefore, it seems appropriate to compose a task by reasoning about partial changes to be performed on a template. Jointly, these changes make it possible to meet a specific placeness context situation. These local changes need to be applied to specific parts of a template (see
Fig. 2). These parts can be represented by three generic categories that divide the structure of a template – encountering-method recommender, information delivery, and environmental effecter. The first part covers the functionality required to recommend a social encountering method among the users of a social group, and the general purpose of carrying out that interaction. The second part concerns delivering useful cognitive information to users, associated with the purpose of the encounter. The third part considers the functionality that generates physical effects in the UrbComp environment, such as audio content that produces sound, an adjustment of brightness or temperature, or the movement of objects in the space – opening/closing windows or doors, etc.
Fig. 2. Generic Structure of a Task Template
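The concrete data structures are not given in the paper; the following minimal Python sketch (class and field names are ours) only illustrates how a template split into these three generic parts might be represented, using the unit-tasks of the case-study template described later in Sect. 5:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class UnitTask:
    name: str
    structural_type: str  # "encounter_recommender", "info_delivery" or "env_effecter"

@dataclass
class TaskTemplate:
    # The three generic structural parts of Fig. 2.
    encounter_recommenders: List[UnitTask] = field(default_factory=list)
    info_deliverers: List[UnitTask] = field(default_factory=list)
    env_effecters: List[UnitTask] = field(default_factory=list)

# The "having an informal meeting" template of the case study, in this representation.
informal_meeting = TaskTemplate(
    encounter_recommenders=[UnitTask("recommending an informal meeting", "encounter_recommender")],
    info_deliverers=[UnitTask("suggesting meeting place", "info_delivery")],
    env_effecters=[UnitTask("alarming proximity of users", "env_effecter")],
)
```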
4 Task Composition Mechanism

The input of the task composition algorithm consists of a task template and a set of unit-task candidates. The former is the template semantically closest to the placeness context provided by a place-aware context manager. Both the task template and the unit-tasks are selected according to the algorithm described in [2]. The output of this algorithm corresponds to a set of extended task template candidates. The extension is performed by substituting, adding and reducing unit-tasks on top of the original template.

4.1 Reduction of Unit-Task Search Space

Let $R = \{r_1, \ldots, r_l\}$, $D = \{d_1, \ldots, d_m\}$ and $E = \{e_1, \ldots, e_n\}$ be the sets of unit-tasks that make up the three parts of the selected template structure, respectively. These sets are obtained by matching, in the unit-tasks of the template, the structural-type property value – encountering method recommender, information delivery, or environmental effecter. On the other hand, let $R_c$, $D_c$ and $E_c$ be the analogous sets for the selected unit-task candidates. All the property values of the unit-tasks that belong to $R$, $D$ and $E$ are included in the equivalent property values of those in $R_c$, $D_c$ and $E_c$, because the template and the unit-tasks are selected on the basis of the same placeness context. However, the latter sets may contain unit-tasks whose properties with multiple cardinality include additional values. Therefore, it is necessary to filter the unit-tasks in $R_c$, $D_c$ and $E_c$ that provide a more complete functionality. This is done by matching the placeness context values against two specific properties of the unit-tasks: (1) the place potential property, related to the spatial aspect of the unit-task description model, and (2) the preference property of the social group, related to the personal and social aspect of the unit-task description model. This filtering reduces the search space of unit-tasks, yielding the new sets of smaller cardinality $\tilde{R}_c$, $\tilde{D}_c$ and $\tilde{E}_c$.

4.2 Analysis on Unit-Task Substitutability

For the unit-tasks of each structural type in $\tilde{R}_c$, $\tilde{D}_c$ and $\tilde{E}_c$, a substitutability analysis is conducted against the unit-tasks that belong to $R$, $D$ and $E$, respectively. This analysis consists of identifying those unit-tasks in the template that can be substituted by more appropriate unit-tasks. In order to find a substitute for each unit-task $r_j \in R$, the following two conditions are checked, which must be jointly satisfied: (1) identification of a candidate $\hat{r}_j \in \tilde{R}_c$ with an equivalent functionality; (2) $\hat{r}_j$ must be interoperable, in terms of its inputs, outputs, effects and preconditions (IOEP), with the neighboring unit-tasks that $r_j$ interacts with in $D$ and $E$ [12]. The substituent unit-tasks and those that were not substituted define the new set $\hat{R} = \{\hat{r}_1, \hat{r}_2, \ldots, \hat{r}_l\}$ of unit-tasks. Each $\hat{r}_j$ defines the first structural part of a different extended template candidate – encountering method recommenders and social activity recommenders.

The substitutability analysis for each unit-task $d_j \in D$ consists of checking the following three conditions, also jointly satisfied: (1) the candidate $\hat{d}_j \in \tilde{D}_c$ must be functionally equivalent; (2) in case $d_j$ interoperates with a unit-task $\hat{r}_j$, the candidate $\hat{d}_j$ must also be interoperable with that predecessor; (3) in case $d_j$ interoperates with unit-tasks in $E$ – which do not interoperate with a unit-task in $\hat{R}$ – the candidate $\hat{d}_j$ must also be interoperable with its successors, in order to avoid closed cycles in the coordination logic. Meeting these conditions for each unit-task in $D$ allows the set $\hat{D}_j$ to be defined for the extended template candidate corresponding to the unit-task $\hat{r}_j$. The set $\hat{D}_j$ is composed of substituent unit-tasks and of those that were not substituted, that is, the unit-tasks that form the second structural part of the extended template candidate corresponding to $\hat{r}_j$ – the information deliverers. It should be noted that a given unit-task can be used as a substitute in the definition of one or more extended template candidates.

Finally, the substitutability analysis for each unit-task $e_j \in E$ consists of checking: (1) the functional equivalence of a candidate $\hat{e}_j \in \tilde{E}_c$ with $e_j$; (2) the interoperability of $\hat{e}_j$ either with its predecessor $\hat{r}_j$ or with its predecessors in $\hat{D}_j$. Once again, in order to avoid closed cycles, $\hat{e}_j$ is only allowed to interoperate with its predecessor $\hat{r}_j$ or with its predecessors in $\hat{D}_j$, but not with both jointly. Analogously, the set $\hat{E}_j$ is defined, whose unit-tasks form the third structural part of the extended template candidate corresponding to $\hat{r}_j$ – the environmental effecters.

4.3 Analysis on Unit-Task Addition

This stage consists of adding new unit-tasks to each extended template candidate. Such an addition is performed out of the unit-tasks of $\tilde{D}_c$ and $\tilde{E}_c$ that were not used as substitutes in the previous stage. Thus, let $D^{+}$ be the set defined by subtracting the unit-tasks assigned as substitutes from $\tilde{D}_c$; the set $E^{+}$ is defined analogously. The inclusion or not of a given unit-task into an extended template candidate is decided based on how historically users have accepted that functionality in situations with a similar placeness context. The rationale of this historically-based approach comes from the results obtained in studies of users' behavior. Since Magnusson et al. [6], researchers in behavioral psychology have been demonstrating the basic assumption that users' "behavior is more similar across situations which are perceived and interpreted as similar by individuals than across situations which are perceived as less similar or not similar at all" [6]. The results have shown plenty of statistically significant cases. In this sense, the inclusion of new unit-tasks that would interoperate with other unit-tasks needs to be decided in light of the experience associated with the execution of these unit-tasks in a specific placeness context situation.

First, this is done for the unit-tasks in the set $D^{+}$ for a specific placeness context situation $pc$, informed by a place-aware context manager. This context is composed of the spatial, temporal, personal and social variable values specified in [2] (see Fig. 1). Let $c_i$, $i = 1, \ldots, p$, be a set of unit-task composition candidates; these composites are extracted out of interoperable unit-tasks in $D^{+}$. Let $h_j$, $j = 1, \ldots, q$, be a set of task templates historically executed under the placeness context $pc$. For each $c_i$ and for each $h_j$ the matching value $m(c_i, h_j)$ is computed. This function takes the value 1 if the composite appears exactly in the second structural part of the template, independently of the appearance of other composites; it takes the value 0 otherwise. The affinity of the composite $c_i$ with the totality of templates is computed as follows:

$$\mathrm{aff}(c_i) = \frac{1}{q} \sum_{j=1}^{q} m(c_i, h_j) \in [0, 1] \qquad (1)$$

In order to choose the most appropriate unit-task composite among the candidates, the composite $c_i$ is selected such that it is interoperable with $\hat{r}_j$ or with some unit-task in $\hat{D}_j$, and has the maximum affinity value in the list of candidates ranked by affinity. By computing the maximum, the composite of unit-tasks most frequently used by users in that placeness context $pc$ is chosen. Therefore, the set of unit-tasks in the second structural part of the extended template contains additional unit-tasks; this enlarged set is denoted by $\hat{D}^{+}_j$.

Second, for the environmental effecter unit-tasks in the set $E^{+}$, an affinity value is computed against $\hat{r}_j$ on the one hand, and against the unit-tasks in $\hat{D}^{+}_j$ on the other hand. In general, the affinity between a pair of unit-tasks $t$ and $t'$ is a value

$$\mathrm{aff}(t, t') \in [0, 1] \qquad (2)$$

computed from the templates historically executed under the placeness context $pc$. As stated in the previous stage, in order to avoid closed cycles, interoperation is only allowed either with the predecessor $\hat{r}_j$ or with predecessors in $\hat{D}^{+}_j$, independently.
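A brief Python sketch of the historically-based addition step, under our own assumptions about the data representation (templates and composites as plain sets of unit-task names; the interoperability check is abstracted into a predicate, since the IOEP matching of [12] is not reproduced here):

```python
from typing import Callable, FrozenSet, List, Set

def affinity(composite: FrozenSet[str], history: List[Set[str]]) -> float:
    """Fraction of historical templates (executed under the same placeness
    context) whose second structural part contains the candidate composite."""
    if not history:
        return 0.0
    return sum(1 for tpl in history if composite <= tpl) / len(history)

def pick_addition(candidates: List[FrozenSet[str]],
                  history: List[Set[str]],
                  is_interoperable: Callable[[FrozenSet[str]], bool]) -> FrozenSet[str]:
    """Among the interoperable candidates, return the one most frequently
    used by users under this placeness context (maximum affinity)."""
    usable = [c for c in candidates if is_interoperable(c)]
    return max(usable, key=lambda c: affinity(c, history)) if usable else frozenset()

# Invented history: second structural parts of templates executed at lunch time
# in a public corridor.
history = [
    {"suggesting meeting place", "displaying restaurant location map"},
    {"suggesting meeting place", "displaying restaurant location map"},
    {"suggesting meeting place"},
]
candidates = [frozenset({"displaying restaurant location map"}),
              frozenset({"displaying seminar room location"})]

best = pick_addition(candidates, history, lambda c: True)
print(best)   # frozenset({'displaying restaurant location map'}), affinity 2/3
```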
5 Implementation and Evaluation

The task manager server has been allocated on a machine attached to a public corridor of the research facilities of our campus. It consists of an Intel(R) Core(TM) 2, 1.87 GHz CPU and 2 GB of memory. A client has been installed on three Ultra Mobile PCs (UMPC), 1.8 GHz CPU and 512 MB of memory. These clients interface with the users when no historical information is available and end-user authoring of the task is required, or when information of the unit-tasks of the second category has to be visualized. The co-presence identification of users in the public corridor has been realized by detecting their UMPCs through Bluetooth. For that, the BlueSoleil stack [13] has been used to allow Bluetooth communication between the task manager server and the UMPCs. In addition, the BlueCove [14] Java API has been used to implement the required listener of Bluetooth signals and to extract the identification information of the devices.

Let us consider the following illustrative scenario: Chandler, Eun-Min and Antonella are three first-year graduate students who regularly have a class in adjacent classrooms in one of the two campuses of the University. They are three familiar strangers [15]. They go to the other campus of the University on the same day and at the same time. The Bluetooth detection determines their co-presence in one of the public corridors. The place-aware context manager has information regarding their familiarity type, belonging to the place, and profile (age, preferences). Moreover, the place-aware context manager keeps information about spatial and temporal aspects. For instance, the potential of the place ranges from research to leisure and entertainment; the phase of the day is noon. Based on the placeness information, the place-aware context manager creates a group composed of three familiar strangers, whose major common preferences are related to performing presence-socializing activities with new people and eating Thai food.

By matching the placeness context against the task ontology nodes, the task selection algorithm determines "having an informal meeting" as the most appropriate task among those existing in the server. The template of this task is composed of the following unit-tasks: recommending an informal meeting, suggesting meeting place, and alarming proximity of users. Their sequential coordination logic is shown in Fig. 3a in BPEL [16]. On the other hand, the unit-task selection mechanism results in: recommending having a working lunch, displaying restaurant location map, having a social academic activity, having a research meeting, suggesting meeting place, alarming proximity of users, and displaying seminar room location, among other unit-tasks. For this scenario, the reduction of the unit-task search space filters out those unit-tasks that do not fully match the place potential and user preferences. In that sense, it filters out the unit-task having a social academic activity because it meets neither the leisure nor the entertainment place potential, nor the Thai food preference. The same analysis is conducted to filter out the unit-tasks having a research meeting and displaying the seminar room location. The analysis on unit-task substitutability determines that recommending an informal meeting can be substituted by recommending having a working lunch.
The reason is that the substitute provides an equivalent, even more specialized, functionality, since it is in a subsumption relationship with the original unit-task of the template within the
unit-task ontology. In addition, the substitute is interoperable with suggesting meeting place. The reason is that one of the abstract services of recommending an informal meeting is the place to meet, which is presented to the user in a specific interface by activating the unit-task suggesting meeting place.
Fig. 3. Task coordination logic in BPEL: 3a. original application template for “having an informal meeting” task, and 3b. the extended application template.
Therefore, recommending having a working lunch is set as the first structural part of the extended template. The second structural part of the template resulting from the substitutability analysis is kept the same, since both templates share suggesting meeting place. For the same reason, the environmental effecter part is also kept the same and set with the unit-task alarming proximity of users. By computing the affinity value between displaying restaurant location map and the historical templates that have been executed under the same configuration of placeness context, the analysis of unit-task addition determines a high affinity value. Then, the second structural part of the new
template is extended by adding displaying restaurant location map. The coordination logic of the extended template is shown in Fig. 3b in BPEL.

Table 1. Placeness context for the illustrative scenario
Aspect                 Property                   Value
Spatial and Temporal   Urban environment          Computer science Dept. corridor
                       Constraint                 Quiet (level of noise)
                       Level of publicness        Public
                       Place potential            Research, leisure, entertainment
                       Phase of the day           Noon
                       Day of the week            Friday
Personal and Social    Familiarity type           Familiar strangers
                       Belonging to the place     Insiders
                       Age range                  Youth
                       Social group preferences   Presence socialization, Thai food
6 Conclusion and Future Work

This paper proposes a task composition mechanism to support the spontaneous interaction of social groups in UrbComp environments. The mechanism combines a semantically-based approach and a historically-based approach. By leveraging task-oriented computing, the mechanism consists of contrasting the functionality of the semantically closest task template with a set of unit-tasks, both selected according to a placeness context situation. The objective of this comparison is to extend a template to support the ad hoc interaction of users that spontaneously form a social group. This paper also proposes a generic structural model of templates. This model is composed of three generic categories of unit-tasks: (1) those that recommend a social encounter method and a social activity; (2) unit-tasks that deliver cognitive information through user interfaces; and (3) unit-tasks that produce a physical effect in the UrbComp environment. The template extension is realized by three major steps in the task composition mechanism: (1) reduction of the unit-task search space; (2) analysis of unit-task substitutability; and (3) analysis of unit-task addition. Historical information is used to measure the affinity of a unit-task with a set of templates executed in the past under a specific placeness context situation. This research will be extended to a task prediction mechanism based on historical information. The objective of this mechanism will be to extract users' behavioral patterns that could improve the effectiveness of the task composition mechanism. In addition, the mechanism proposed herein will be enhanced with urban-specific composition strategies that allow unit-tasks to be composed spontaneously without having any initial template to extend.

Acknowledgments. This work was supported by the IT R&D program of MKE/KEIT under grant KI001877 [Locational/Societal Relation-Aware Social Media Service Technology]. Multiple discussions with Gonzalo Huerta-Canepa have benefited this research. The authors also thank Byung-Seok Kang.
References 1. Weiser, M.: The Computer for the 21st Century. In: Scientific American, pp. 94–104; Reprinted in IEEE Pervasive Computing, pp. 19–25. IEEE, Los Alamitos (2003) 2. Jimenez-Molina, A.A., Kang, B., Kim, J., Ko, I.Y.: A Task-oriented Approach to Support Spontaneous Interactions among Users in Urban Computing Environments. In: Proceedings of the IEEE International Conference on Pervasive Computing and Communications (PERCOM). The 7th IEEE PerCom Workshop on Managing Ubiquitous Communications and Services (MUCS), pp. 176–181. IEEE, Los Alamitos (2010) 3. Huerta-Canepa, G.F., Jimenez-Molina, A.A., Ko, I.Y., Lee, D.: Adaptive Activity based Middleware. IEEE Pervasive Computing 7(2), 58–61 (2008) 4. Jimenez-Molina, A.A., Kim, J., Koo, H., Kang, B., Ko, I.Y.: A Semantically-based Task Model and Selection Mechanism in Ubiquitous Computing Environments. In: Proc. 13th International Conference on Knowledge-Based Intelligent Information & Engineering Systems (KES 2009). LNCS, pp. 829–837. Springer, Heidelberg (2009) 5. Wang, Z., Garlan, D.: Task-Driven Computing. Technical Report, CMU - CS -00-154 (2000) 6. Magnusson, D., Ekehammar, B.: Similar Situations-Similar Behaviors? A Study of the Intraindividual Congruence Between Situation Perception and Situation Reactions. Journal of Research in Personality 12, 41–48 (1978) 7. Quentin, J., Grandhi, S.: P3 Systems: Putting the Place Back into Social Networks. IEEE Internet Computing 9(5), 38–46 (2005) 8. Gupta, A., Kalra, A., Boston, D., Borcea, C.: MobiSoC: A Middleware for Mobile Social Computing Applications. Mobile Networks and Applications 14(1), 35–52 (2009) 9. Gupta, A., Paul, S., Jones, Q., Borcea, C.: Automatic Identification of Informal Social Groups and Places for Geosocial Recommendations. The International Journal of Mobile Network Design and Innovation (IJMNDI) 2(3/4), 159–171 (2007) 10. Sousa, J.P., Garlan, D.: Aura: An Architectural Framework for User Mobility in Ubiquitous Computing Environments. In: 3rd Working IEEE/IFIP Conf. on Software Architecture (2002) 11. Roman, M., Hess, C.K., Cerqueira, R., Ranganathan, A., Campbell, R.H., Nahrstedt, K.: Gaia: A Middleware Infrastructure to Enable Active Spaces. IEEE Pervasive Computing 1, 74–83 (2002) 12. Shajjar, A., Khalid, N., Ahmad, H.F., Suguri, H.: Service Interoperability between Agents and Semantic Web Services for Nomadic Environment. In: Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, vol. 2, pp. 583–586 (2008) 13. The BlueSoleil Bluetooth Software, http://www.bluesoleil.com 14. The BlueCove Java library for Bluetooth, http://bluecove.org/ 15. Paulos, E., Goodman, E.: The Familiar Stranger: Anxiety, Comfort, and Play in Public Places. In: Proc. Conference on Human Factors in Computing Systems, pp. 223–230. ACM, New York (2004) 16. The Business Process Execution Language, http://bpelsource.com/
Multi-criteria Retrieval in Cultural Heritage Recommendation Systems Pierpaolo Di Bitonto, Maria Laterza, Teresa Roselli, and Veronica Rossano Department of Computer Science, University of Bari, Via Orabona 4, 70126 Bari, Italy {dibitonto,marialaterza,roselli,rossano}@di.uniba.it
Abstract. In recent years there has been a growing interest in mobile recommender systems for the tourism domain, because they can support users visiting new places, suggesting restaurants, hotels, attractions, or entire itineraries. The effectiveness of their suggestions mainly depends on the item retrieval process. How well is the system able to retrieve items that meet users’ needs and preferences? In this paper we propose a multi-criteria collaborative approach, that offers a complete method for calculating users’ similarities and rating predictions on items to be recommended. It is a purely multi-criteria approach, that uses Pearson's correlation coefficient to compute similarities among users. Experimental results evaluating the retrieval effectiveness of the proposed approach in a prototype mobile cultural heritage recommender system (that suggests visits to cultural locations in Apulia region) show a better retrieval precision than a standard collaborative approach based on the same metrics. Keywords: multi-criteria retrieval, cultural heritage, mobile recommender system.
1 Introduction

In recent years there has been a growing interest in studying and implementing recommender systems for tourism available on mobile devices, because they can guide the tourist on visits to new places, suggesting localities that meet her/his needs and preferences: restaurants, museums, buildings of architectural interest, and so on. Each of these systems uses a more or less complex method to retrieve the items in the recommendation process. Up to now, the most widely used approaches have been content-based and case-based. The former describes the items to be suggested, together with their properties, and matches them to user needs or preferences [1]; the latter stores cases of successful recommendation sessions, to be retrieved on the basis of their similarity with the current session [2]. In mobile recommendation systems for tourism, the collaborative method has not been used much as yet. Its main idea is to aggregate ratings or recommendations of objects, recognize common features among users on the basis of their ratings, and generate new recommendations based on inter-user comparisons [3]. In the standard collaborative method (single criterion), the user most similar to the current one is detected by applying a similarity function to the items' overall ratings. Then, aggregation functions are used to predict the rating values for unrated items.
In this process the user rates the item by attributing a single value that summarizes the evaluation of several aspects. For example, two users can provide the same overall rating for an item, e.g. How much do you like “Castel del Monte” (1-5)? Both users rate it 5, although in their evaluations they adopt different points of view (the first user considers the architectural style, the second the castle’s state of conservation, because they have different backgrounds). This gives rise to ambiguous similarity measures that lead to less precise recommendations. To overcome this problem, multi-criteria approaches to suggest movies or restaurants have been proposed. The items are rated according to different features: history, actors and direction for the movies, or food, service, cost and décor for restaurants. In this way the system can ascertain not only how much an item is liked by the user, but also why s/he prefers it. The similarity is computed on the various features rated for each item. Moreover, in the movies domain it has been shown that the use of such approaches provides more accurate similarity measures, improving the recommendation precision [4]. Our idea is to use a multi-criteria approach to retrieve cultural heritage items, because the complexity of the items to be recommended makes it necessary to consider several dimensions. Moreover, it has been proven that multi-criteria techniques lead to a higher recommendation accuracy. The paper is organised as follows: section 2 presents some research into recommender systems for tourism and multi-criteria techniques; section 3 describes the theoretical basis of the proposed approach; section 4 reports an application to the field of cultural heritage recommendation; section 5 illustrates the experimental plan and the results. Finally, some conclusions and future research directions are outlined.
2 Related Works

In mobile recommender systems for tourism, different approaches to retrieving the items to be suggested have been used. MobyRek [5], for example, retrieves travel products (i.e. restaurants) by matching the product features with users' needs and preferences. In particular, the system represents each travel product (the item to be recommended) as a feature vector (item name, distance from the user, type, cost, days open, and so on). The information about the users' preferences (or needs) is taken from three different sources: the logical query, the favorite pattern, and the feature importance weights vector. The logical query is a conjunction of logical constraints inserted by the user during the system interaction; the favorite pattern is a set of conditions used to make a trade-off between must-conditions (what the user wants) expressed in the logical query and wish-conditions (what the user would like) inferred from relevance feedback; finally, the feature importance weights vector is used to represent the importance of each item feature. The retrieval process evolves in cycles. At each cycle the system produces a list of recommended products that the user can refine by adding new constraints (wish conditions). The city guide presented in [6] suggests multimedia content about the points of interest in Oldenburg (churches, museums, theatres, and so on) by matching the user's interests with the metadata associated with the sightseeing spots. In [7], the CHIP (Cultural Heritage Information Personalisation) system (http://www.chip-project.org) is presented. The aim of the system is to suggest artworks in the Rijksmuseum in Amsterdam, using a content-based approach. The user
profile is learned from the users' relevance feedback and each artwork is described by several fixed properties (such as the artist, place, historical context, theme, and so on), identified by art experts. The system retrieves paintings and sculptures matching the individual user's ratings and the properties of the artworks. There are very few examples of collaborative multi-criteria retrieval in mobile recommendation systems for tourism. The most cited is Zagat's Guide (http://www.zagat.com), a system which suggests restaurants in the main cities of the world. Zagat's Guide describes each restaurant using three criteria (food, service, and décor) that match users' preferences. The rating for each criterion is the same for all users, and is not personalized to each individual consumer. This approach does not take full advantage of the potential of multi-criteria ratings. Although multi-criteria ratings have not yet been used much in recommender systems, in the last few years there has been a growing interest in models, algorithms and experiments in this direction. In [8], Adomavicius and Tuzhilin state that "in some applications, such as restaurant recommenders, it is crucial to incorporate multi-criteria ratings into recommendation method". Similarly, in the movie recommendation domain, two multi-criteria approaches and their performance evaluation are presented in [4]. The first extends the standard nearest neighbor collaborative algorithm; the second uses domain-independent metrics that can be implemented in several recommendation methods (content-based, collaborative, and so on). In both cases, each user rates each item from two different points of view: general and specific. The experimental results obtained with a set of ratings supplied by users of the Yahoo! Movies website (http://www.movies.yahoo.com) show an improvement in recommendation accuracy in the movies domain as compared to the use of a single-criterion approach. Lakiotaki and Matsatsinis in [9] observe that in a multi-criteria approach the similarity computation should consider the number of co-rated items as well as the user ratings. Our research aims to expand a standard nearest neighbor collaborative algorithm by adding three main novelties: it is a purely multi-criteria approach (lacking any overall evaluation); it uses Pearson's correlation coefficient [10] to compute similarities between two users; and finally, it considers the number of co-rated items in the overall similarity computation.
3 A Multi-criteria Approach

Standard user-based nearest neighbor algorithms solve the recommendation problem both by detecting a set of users similar to the target user (the neighbourhood), and by making rating predictions for unrated items. So, the retrieval process is closely related to the degree of similarity among users and depends on the aggregation function used. Adomavicius and Kwon in [4] propose multi-criteria techniques that consider both a set of k multi-criteria ratings and an overall rating for the item. They suggest computing k+1 similarity measures between each pair of users in order to obtain an overall similarity measure. To compute the k+1 similarity measures they suggest using standard metrics, but focus their attention on cosine-based and multi-dimensional distance metrics (Euclidean distance, Chebychev distance and Manhattan distance).
If cosine-based metrics are used, in order to collapse the k+1 similarity measures into a single overall similarity value between two users, "Average similarity" (the average of the k+1 similarity measures) and "Worst-case similarity" (the smallest similarity) are proposed. Otherwise, if multi-dimensional distance metrics are used, the overall similarity between two users is calculated on the basis of the inverse relation between distance and similarity. In both cases, a weighted sum and an adjusted weighted sum are proposed to make predictions. Because our proposal is a purely multi-criteria approach, it takes into account only k ratings. We compute k similarity measures between each pair of users using Pearson's correlation coefficient, "that ranges from 1.0 for users in perfect agreement to -1.0 for users in perfect disagreement" [10], because this has been proven to work better than cosine-based similarity [11], Spearman correlation, entropy-based uncertainty and mean-squared difference [12]. In order to fit the k similarity measures into a single overall similarity value we compute the average of the k similarity measures, considering the number of items co-rated by each pair of users. Finally, the adjusted weighted sum is used as the aggregation function.

More formally, let us consider u as the target user, u' as a generic user, k as the number of criteria, S the set of items, and $S_{u,u'} \subseteq S$ the set of items co-rated by users u and u'. For each co-rated item $s \in S_{u,u'}$, the two users supply the rating vectors $R = (r_{u,s_1}, r_{u,s_2}, \ldots, r_{u,s_k})$ and $R' = (r_{u',s_1}, r_{u',s_2}, \ldots, r_{u',s_k})$, respectively, where $r_{u,s_1}$ is the rating by user u of the first criterion of s. The vector of similarity measures is $(\mathrm{sim}_1(u,u'), \mathrm{sim}_2(u,u'), \ldots, \mathrm{sim}_k(u,u'))$, where $\mathrm{sim}_i(u,u')$ is the similarity between the ratings by users u and u' for criterion i ($i \in [1, k]$):
$$\mathrm{sim}_i(u,u') = \frac{\displaystyle\sum_{s \in S_{u,u'}} \left(r_{u,s_i} - \bar{r}_{u,i}\right)\left(r_{u',s_i} - \bar{r}_{u',i}\right)}{\sqrt{\displaystyle\sum_{s \in S_{u,u'}} \left(r_{u,s_i} - \bar{r}_{u,i}\right)^2}\ \sqrt{\displaystyle\sum_{s \in S_{u,u'}} \left(r_{u',s_i} - \bar{r}_{u',i}\right)^2}} \qquad (1)$$
Let us consider that $\bar{r}_{u,i}$ is the average of all of user u's ratings on criterion i. We assume that all items are described by the same criteria and rated on all criteria. To fit the similarity vector values into a single similarity value $\mu(u,u')$, the following formula is used:

$$\mu(u,u') = \frac{\displaystyle\sum_{i=1}^{k} \mathrm{sim}_i(u,u')}{k} \cdot \frac{|S_{u,u'}|}{|S|} \qquad (2)$$
where |S| and | Su,u’| are the S and Su,u’ cardinality set. Two users could have the same similarity measure with respect to the target user, even if the number of co-rated items is very different. For this reason the average similarity on all criteria is weighted on the confidence index (|Su,u’|/|S|). So, in the case of equal similarity measures, the higher the confidence index value, the higher μ(u, u’).
Finally, we make the rating prediction by applying the adjusted weighted sum:
$$\mathrm{pred}(u,s) = \bar{r}_u + \frac{\displaystyle\sum_{u' \in \mathrm{neighbour}(u)} \mu(u,u')\left(r_{u',s} - \bar{r}_{u'}\right)}{\displaystyle\sum_{u' \in \mathrm{neighbour}(u)} \mu(u,u')} \qquad (3)$$
where s is an item unrated by the target user u; $\bar{r}_u$ is the average, over all rated items, of the mean rating given by the target user u across the item criteria; in a similar way, $\bar{r}_{u'}$ is the average rating computed for the user u' (where $u' \in \mathrm{neighbour}(u)$); and $r_{u',s}$ is user u''s average rating over the different criteria of item s. The main advantage of this aggregation formula is that it considers the subjective user's perspective. In fact, as stated in [15], "one optimistic happy user may consistently rate things 4 of 5 stars that a pessimistic sad user rates 3 of 5 stars. They mean the same thing ("one of my favorite items"), but use the numbers differently".
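A compact Python sketch of equations (1)–(3). The storage layout (nested dictionaries keyed by user and item), the catalogue size and the example ratings are our assumptions, not the prototype's actual implementation:

```python
from math import sqrt

# ratings[user][item] = [rating on criterion 1, ..., rating on criterion k]
ratings = {
    "u1": {"castel_del_monte": [5, 3, 4], "trani_castle": [4, 2, 4], "bari_museum": [2, 5, 3]},
    "u2": {"castel_del_monte": [4, 3, 5], "trani_castle": [5, 2, 4], "swabian_castle": [4, 3, 4]},
}
N_ITEMS = 10   # |S|: assumed size of the whole item catalogue
K = 3          # criteria: architectural style, ease of access, welcome quality

def mean_on_criterion(user, i):
    vals = [r[i] for r in ratings[user].values()]
    return sum(vals) / len(vals)

def sim_i(u, v, i, co_rated):
    """Equation (1): Pearson-style similarity on criterion i over co-rated items."""
    mu, mv = mean_on_criterion(u, i), mean_on_criterion(v, i)
    du = [ratings[u][s][i] - mu for s in co_rated]
    dv = [ratings[v][s][i] - mv for s in co_rated]
    num = sum(a * b for a, b in zip(du, dv))
    den = sqrt(sum(a * a for a in du)) * sqrt(sum(b * b for b in dv))
    return num / den if den else 0.0

def mu_similarity(u, v):
    """Equation (2): average per-criterion similarity weighted by |S_uv| / |S|."""
    co_rated = sorted(set(ratings[u]) & set(ratings[v]))
    if not co_rated:
        return 0.0
    avg = sum(sim_i(u, v, i, co_rated) for i in range(K)) / K
    return avg * len(co_rated) / N_ITEMS

def predict(u, s, neighbours):
    """Equation (3): adjusted weighted sum over the neighbourhood."""
    r_bar_u = sum(sum(r) / K for r in ratings[u].values()) / len(ratings[u])
    num = den = 0.0
    for v in neighbours:
        if s not in ratings[v]:
            continue
        w = mu_similarity(u, v)
        r_bar_v = sum(sum(r) / K for r in ratings[v].values()) / len(ratings[v])
        r_vs = sum(ratings[v][s]) / K
        num += w * (r_vs - r_bar_v)
        den += w
    return r_bar_u + num / den if den else r_bar_u

# Predict u1's rating for an item rated only by the neighbour u2.
print(predict("u1", "swabian_castle", ["u2"]))
```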
4 Application of the Proposed Approach to Cultural Heritage Recommendation Systems

The theoretical model presented in the previous section has been applied in a prototype mobile cultural heritage recommender system, with the aim of suggesting cultural heritage items to tourists. The system is able to guide tourists visiting several cultural locations in the Apulia region (in the South of Italy) and makes recommendations as follows. At the user's first interaction, the system allows the user to express her/his interest (on a scale from 0 to 5) in several different categories: churches, castles, museums, and squares (figure 1(a)). If the user rates the castle category 0, it means s/he is not interested in visiting castles. If the user rates the castle category 5 and the museum category 3, s/he is more interested in visiting castles than museums. So, the initial profile of the user's interests is acquired and a selection of items is presented (figure 1(b)). In this first selection the number of items proposed for each category is proportional to the interest expressed by the user. For each item a brief description is given (figure 1(b)); this includes some information about the distance from the tourist, the address, telephone number, cost of the ticket, historical information on the place to be visited, the visiting hours, the website link, and so on. After the visit, the user is asked to evaluate the experience (on a scale from 1 to 5) according to three criteria: architectural style, ease of access and welcome quality (figure 1(c)). The profile is updated on the basis of the user's feedback. So, at the next interaction the system will be able to propose items filtered according not only to the preferred category, but also to the specific kind of item. For example, if the user visits "Castel del Monte", which is a Swabian castle, and expresses positive feedback (i.e. a value equal to or greater than 3 for each criterion), at the next interaction other Swabian castles will be suggested. This is possible because the items are catalogued in the system according to their architectural style.
Fig. 1. Screenshots of the recommendation process: (a) the user expresses her/his interest in each item category (church, castle, square, museum); (b) a summary of the details of the recommended items is shown; (c) if the user visits the recommended items, s/he can evaluate the experience according to three criteria: architectural style, ease of access and welcome quality; (d) a new recommendation list is proposed according to the user's feedback
The user's feedback is the input to the multi-criteria retrieval approach. The system stores the multi-criteria ratings and measures the similarity between the target user and the others using Pearson's correlation coefficient. Then, it computes the total similarity value μ(u, u') and makes predictions for each item according to the adjusted weighted sum (as explained in the previous section). Each new piece of user feedback modifies the set of co-rated items Su,u', the similarity measures, the set of neighbors, and the set of retrieved items (figure 1(d)). Note that even though the system is able to acquire the user's geographical position, we do not use it in the retrieval process to calculate the distance between the user and the items; the distance information is used merely to filter the recommendation list (according to the user's profile).
5 Experimental Plan

The main goal of the experimental plan is to compare the retrieval effectiveness of the proposed multi-criteria approach with that of a standard single-criterion approach to recommending cultural heritage items. The standard approach uses Pearson's correlation coefficient and an adjusted weighted sum. The most popular metrics used to evaluate information retrieval systems are precision and recall [13]. These metrics have also been used to evaluate the retrieval process in recommender systems [12], [14], [15]. In order to compute the precision and recall measures, the set of items has to be divided into two subsets according to the user's query (relevant and not relevant). Precision is the ratio of relevant retrieved items to the total items selected (#relevant items selected / #items selected); it expresses the system's ability to propose the most relevant items in the first positions of a ranked list. The recall measure, instead, is the ratio of relevant items retrieved to the total relevant items available (#relevant items selected / #relevant items available); it expresses the system's ability to retrieve all the relevant items in a collection. Precision and recall depend on both the definition of the relevance of an item and the method used to compute it. According to Herlocker et al. [16], in recommender system evaluation the "user is the only person who can determine if an item meets his taste requirements…the relevance is more inherently subjective in recommender systems than in traditional document retrieval".

In order to evaluate the proposed multi-criteria approach and to compare it with a standard single-criterion approach, the experimental plan defines both the item relevance on the basis of the user's feedback and the recall points in the recommendation list, and it computes the precision values corresponding to the defined recall points. Finally, a precision-recall graph is drawn. We define as relevant those cultural heritage items that the user rates from 3 to 5 points. At the first interaction, this rating represents the evaluation that the user assigns to each category (churches, castles, museums and squares). At the next interaction, this rating represents the average evaluation assigned to each criterion (architectural style, ease of access and welcome quality) in the multi-criteria approach, and the overall evaluation in the single-criterion approach. For example, if the user gives positive feedback for a Swabian castle (e.g. "Castel del Monte"), all the Swabian castles stored in the system (such as the Castle of Trani, the Swabian Castle of Bari, and so on) will be considered relevant. On the basis of this definition of relevance, the recall points were fixed and the precision was calculated for each of them. Because precision values were lacking for some standard recall points (0%, 10%, …, 100%), interpolated precision values were computed. In order to draw the precision-recall graph, for each approach we computed the average precision values, considering 50 recommendation requests for each recall point. The graph shows the precision trend of the single-criterion approach and the proposed multi-criteria approach. When the recall value is very low (up to 20%), both compared approaches show a very high precision level; in other words, the system retrieves (on average) 2 relevant items out of 10 and proposes them in the first positions regardless of the approach used.
From 20% to 100% of recall, the multi-criteria approach shows a higher precision than the single-criterion one.
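A brief sketch of how precision at the standard recall points can be derived from a ranked recommendation list. The relevance flags are invented, and the maximum-precision-to-the-right interpolation rule is our assumption, since the paper does not spell out its interpolation procedure:

```python
def precision_recall_curve(ranked_relevance, total_relevant):
    """Precision and recall after each position of a ranked list.
    ranked_relevance: list of booleans (is the item at this rank relevant?)."""
    points, hits = [], 0
    for rank, rel in enumerate(ranked_relevance, start=1):
        hits += rel
        points.append((hits / total_relevant, hits / rank))   # (recall, precision)
    return points

def interpolated_precision(points, recall_level):
    """Highest precision at any recall >= recall_level (assumed interpolation rule)."""
    candidates = [p for r, p in points if r >= recall_level]
    return max(candidates) if candidates else 0.0

# Invented ranked list: True = relevant (user rating 3-5), 4 relevant items in total.
ranked = [True, True, False, True, False, False, True, False]
pts = precision_recall_curve(ranked, total_relevant=4)
for level in [i / 10 for i in range(11)]:
    print(f"recall {level:.0%}: precision {interpolated_precision(pts, level):.0%}")
```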
The experimental plan has shown that the retrieval effectiveness of the multi-criteria approach is better than that of the single-criterion approach. This is demonstrated by the precision values reported in the table and by the distance of the precision-recall curve from the origin of the axes (the greater the distance, the better the performance). The experimental results show that the multi-criteria approach achieves higher precision measures using fewer feedback ratings than the single-criterion one.

Table 1. Average Precision values at standard Recall levels

                    Average Precision
Recall    Single-criterion approach    Multi-criteria approach
  0                 100                        100
 10                 100                        100
 20                 100                        100
 30                  66                         87
 40                  60                         75
 50                  60                         66
 60                  50                         66
 70                  50                         66
 80                  50                         60
 90                  50                         60
100                  50                         60
Fig. 2. Precision-recall graph (precision vs. recall curves for the single-criterion and multi-criteria approaches)
6 Conclusion and Future Developments

This paper presents a multi-criteria collaborative approach and reports the experimental results of an evaluation of its retrieval effectiveness in cultural heritage recommendations.
The novelties of our work are: it is a purely multi-criteria approach (lacking any overall evaluation); it uses Pearson's correlation coefficient to compute similarities between two users; and it considers the number of co-rated items in the overall similarity computation. The proposed multi-criteria approach has two main strengths. Firstly, it is able to consider the number of co-rated items in the similarity computation between two users. Secondly, it allows more accurate similarities to be obtained, because they are based on preferences expressed on several different aspects of the item (in our application: architectural style, ease of access and welcome quality). To evaluate the theoretical approach, it has been applied in a mobile cultural heritage recommender system prototype that suggests visits to cultural locations in the Apulia region. The retrieval effectiveness of the multi-criteria approach was demonstrated by comparing it with a standard single-criterion approach. Nevertheless, the multi-criteria method still has a weakness: it computes the overall similarity between two users, µ(u, u'), giving the same weight to each aspect of the item. The overall similarity between a target user u and another user u' is the average of the similarity measures for each criterion considered, but each user could assign a different priority to each criterion. In the cultural heritage domain, for example, a user could be more interested in visiting locations with easy access than locations with a particular architectural style. A future development will be to introduce the user's priorities for each criterion into the similarity computation. Thus, in the similarity measure between the target user u and another user u', µ(u, u'), the criterion given the highest priority by the target user will be attributed greater importance than the others.
Acknowledgement

We would like to thank the students Biagio Anselmi, Davide Cavone, Alessandro Manta and Mauro Silvestri of the "Methods and Techniques for Digital Communication" course (Master's Degree in Informatics) for their contribution to implementing the software.
References 1. Pazzani, M.J., Billsus, D.: Content-based recommendation systems. In: Brusilovsky, P., Kobsa, A., Nejdl, W. (eds.) Adaptive Web 2007. LNCS, vol. 4321, pp. 325–341. Springer, Heidelberg (2007) 2. Smyth, B.: Case-based Recommendation. In: Brusilovsky, P., Kobsa, A., Nejdl, W. (eds.) Adaptive Web 2007. LNCS, vol. 4321, pp. 342–376. Springer, Heidelberg (2007) 3. Burke, R.: Hybrid recommender systems: survey and experiments. User Modeling and User-Adapted Interaction 12, 331–370 (2002) 4. Adomavicius, G., Kwon, Y.: New Recommendation Techniques for Multi-Criteria Rating Systems. IEEE Intelligent Systems 22, 48–55 (2007) 5. Ricci, F., Nguyen, Q.N.: Acquiring and revising preferences in a critique-based mobile recommender system. IEEE Intelligent Systems 22, 22–29 (2007)
6. Scherp, A., Boll, S.: Generic support for personalized mobile multimedia tourist applications. In: 12th Annual ACM International Conference on Multimedia, pp. 178–179. ACM Press, New York (2004) 7. Cramer, H., Evers, V., Ramlal, S., Someren, M., van Rutledge, L., Stash, N., Aroyo, L., Wielinga, B.: The effects of transparency on trust in and acceptance of a content-based art recommender. User Model User-Adap. Inter. 18, 455–496 (2008) 8. Adomavicius, G., Tuzhilin, A.: Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering 17, 734–749 (2005) 9. Lakiotaki, K., Matsatsinis, N., Tsoukiàs, A.: Multicriteria user profiling in recommender systems, http://www.lamsade.dauphine.fr/~tsoukias/papers/ Lakiotakietal.pdf 10. Schafer, J.B., Frankowski, D., Herlocker, J., Sen, S.: Collaborative Filtering Recommender Systems. In: Brusilovsky, P., Kobsa, A., Nejdl, W. (eds.) Adaptive Web 2007. LNCS, vol. 4321, pp. 291–324. Springer, Heidelberg (2007) 11. Breese, J.S., Heckerman, D., Kadie, C.: Empirical analysis of predictive algorithms for collaborative filtering. In: Proceedings 14th Conference on Uncertainty in Artificial Intelligence, pp. 43–52. Morgan Kaufmann, San Francisco (1998) 12. Billsus, D., Pazzani, M.J.: Learning collaborative information filters. In: 15th National Conference on Artificial Intelligence, pp. 46–53. AAAI Press, CA (1998) 13. Bollmann, P., Raghavan, V.V., Jung, G.S., Shu, L.C.: On probabilistic notions of precision as a function of recall. Information Processing and Management 28, 291–315 (1992) 14. Sarwar, B.M., Karypis, G., Konstan, J.A., Riedl, J.: Analysis of recommendation algorithms for E-commerce. In: 2nd ACM Conference on Electronic Commerce, pp. 285–295. ACM Press, New York (2000) 15. Sarwar, B.M., Karypis, G., Konstan, J.A., Riedl, J.: Application of dimensionality reduction in recommender system–A case study. In: ACM WebKDD 2000 Web Mining for E-Commerce Workshop (2000) 16. Herlocker, J.L., Konstan, J.A., Borchers, A., Riedl, J.: An algorithmic framework for performing collaborative filtering. In: 22nd ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 230–237. ACM Press, New York (1999)
An Approach for the Automatic Recommendation of Ontologies Using Collaborative Knowledge Marcos Martínez-Romero1, José M. Vázquez-Naya1, Cristian R. Munteanu2, Javier Pereira1, and Alejandro Pazos2 1
IMEDIR Center, University of A Coruña, 15071 A Coruña, Spain 2 Department of Information and Communication Technologies, University of A Coruña, 15071 A Coruña, Spain {marcosmartinez,jmvazquez,cmunteanu,javierp,apazos}@udc.es
Abstract. In recent years, ontologies have become an essential tool to structure and reuse the exponential growth of information in the Web. As the number of publicly available ontologies increases, researchers face the problem of finding the ontology (or ontologies) which provides the best coverage for a particular context. In this paper, we propose an approach to automatically recommend the best ontology for an initial set of terms. The approach is based on measuring the adequacy of the ontology according to three different criteria: (1) How well the ontology covers the given terms, (2) the semantic richness of the ontology and, importantly, (3) the popularity of the ontology in the Web 2.0. In order to evaluate this approach, we implemented a prototype to recommend ontologies in the biomedical domain. Results show the importance of using collaborative knowledge in the field of ontology recommendation. Keywords: ontology recommendation, Web 2.0, Semantic Web.
1 Introduction

During the last years, ontologies have proven to be a crucial element in many intelligent applications. They are considered a practical way to describe entities within a domain and the relationships between them in a standardized manner, facilitating the exchange of information between people or systems that use different representations for the same knowledge. Even more importantly, knowledge represented using ontologies can be processed by machines, making it possible to perform automatic large-scale data search, analysis, inference and mining [1-2]. Due to the importance of ontologies as the backbone of emerging fields like the Semantic Web [3], a large number of ontologies have been developed and published online during the last decade. At the time of writing this paper, the Swoogle Semantic Web search engine1, which is a well-known tool for searching ontologies, indexes more than 10,000 ontologies from multiple areas, which present different levels of quality and reliability. This huge ontological ocean makes it difficult for users to find the best
http://swoogle.umbc.edu/
ontology for a given task. As an example, a query in Swoogle for the keyword cancer provides 264 different results, and asking users to manually select the most suitable ontology is infeasible, since it adds considerable labor cost for them. Due to this, as the number and variety of publicly available ontologies continue to grow steadily, so does the need for proper and robust strategies and tools to facilitate the task of identifying the most appropriate ontologies for a particular context. This paper presents an approach for ontology recommendation which combines the assessment of three criteria: context coverage, semantic richness and popularity. The combination of these different but complementary evaluation methods aims to provide robust reliability and performance in large-scale, heterogeneous and open environments like the Semantic Web.
2 Related Work

Ontology evaluation is a complex task, which can be considered the cornerstone of the ontology recommendation process. The need for evaluation strategies in the field of ontologies emerged as early as 1994 with the work of Gómez-Pérez at Stanford [4]. Since then, several works have proposed to evaluate the quality of an ontology on the basis of ontology metrics such as the number of classes, properties, taxonomical levels, etc. [5]. More recently, other authors have provided different strategies to assess the coverage that an ontology provides for a particular context or domain [6-9]. A complete review of evaluation strategies has been provided by Brank et al. [10]. We argue that even though it is important to know the quality of an ontology as a function of metrics applied to its elements, and to assess how well the ontology represents a particular context, it is also vital to take into account its social relevance or popularity in the ontological community. Some authors have previously stated the importance of taking popularity into account in ontology evaluation [11-12]. To the best of our knowledge, the approach proposed in this paper is the first that aims to recommend the most appropriate ontologies for a given context using collaborative knowledge extracted from the Web 2.0, while also taking into account the semantic richness of the candidate ontologies.
3 Methods

In this section, we present the approach for automatic ontology recommendation, whose overall workflow is shown in Fig. 1. We start by describing how the initial terms are semantically cleaned and expanded. Then, we illustrate the evaluation methods that constitute the core of the recommendation strategy. Finally, we describe the method followed to aggregate the different obtained scores into a unique score.

3.1 Semantic Cleaning and Expansion

The process starts from a set of initial terms, which can be provided by a user or a Semantic Web agent, or automatically identified from an information resource (e.g. text, image, webpage, etc.). On the basis of one (or several) lexical or terminological
resources (e.g. WordNet2 as the general-purpose lexical resource for the English language, or UMLS3 as a specialized resource for the biomedical field), the set of initial terms is cleaned of erroneous or redundant terms. A term that is not contained in the terminological resources is considered an erroneous term and is removed. Also, if there are terms from the initial set with the same meaning (synonyms), only one of them remains. After that, each remaining term is expanded with its synonyms to increase the possibility of finding it in the ontology.

Example 1. Suppose that the initial set of terms is {leukocyte, white cell, blood, celdj}, and that the selected lexical resource is WordNet. Then, the set of cleaned terms would be {leukocyte, blood}. The term white cell has been removed for being a synonym of leukocyte, and celdj because it does not exist. The set of expanded terms for leukocyte would be {leukocyte, leucocyte, white blood cell, white cell, white blood corpuscle, white corpuscle, wbc}, and for blood it would be just {blood}.
Fig. 1. Workflow of the ontology recommendation process
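The cleaning and expansion step just described can be sketched as follows. This is a minimal illustration only: the SynonymLexicon interface is a hypothetical stand-in for WordNet or UMLS access and is not part of the paper.

```java
import java.util.*;

// Hypothetical lexicon interface standing in for WordNet or UMLS access.
interface SynonymLexicon {
    boolean contains(String term);            // false => erroneous term
    Set<String> synonymsOf(String term);      // synonyms, excluding the term itself
}

public class SemanticCleaner {

    /** Removes erroneous terms and keeps only one term per synonym group. */
    public static List<String> clean(List<String> initialTerms, SynonymLexicon lexicon) {
        List<String> cleaned = new ArrayList<>();
        for (String term : initialTerms) {
            if (!lexicon.contains(term)) continue;             // e.g. "celdj" is dropped
            boolean redundant = false;
            for (String kept : cleaned) {
                if (lexicon.synonymsOf(kept).contains(term)) { // e.g. "white cell" vs "leukocyte"
                    redundant = true;
                    break;
                }
            }
            if (!redundant) cleaned.add(term);
        }
        return cleaned;
    }

    /** Expands each cleaned term with its synonyms, producing the sets of expanded terms. */
    public static Map<String, Set<String>> expand(List<String> cleanedTerms, SynonymLexicon lexicon) {
        Map<String, Set<String>> expanded = new LinkedHashMap<>();
        for (String term : cleanedTerms) {
            Set<String> set = new LinkedHashSet<>();
            set.add(term);
            set.addAll(lexicon.synonymsOf(term));
            expanded.put(term, set);
        }
        return expanded;
    }
}
```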
3.2 Context Coverage Evaluation

This step is aimed at obtaining the ontologies that (partially or completely) represent the context. This is achieved by evaluating how well each ontology from a given repository covers the expanded sets of terms. We propose a metric, called CCscore, which is based on the CMM metric proposed by Alani and Brewster [13], adapted to take into account only exact matches, because it has been found that considering partial matches (e.g. anaerobic, aerosol, aeroplane) can be problematic and reduce the search quality [14]. The CCscore consists of counting the number of initial terms that are contained in the ontology. An initial term is contained in the ontology if there is a class in the ontology that has a label matching the term or one of its expanded terms.

Definition 1. Let o be an ontology, T = {t1, …, tn} is the initial set of terms, C = {t1, …, tm} is the set of terms after the semantic cleaning step, and S = {S1, …, Sm} is the
http://wordnet.princeton.edu/ http://www.nlm.nih.gov/research/umls/
set that contains all the sets of expanded terms, where Si is the set of expanded terms for the term ti ∈ C, with 1 ≤ i ≤ m.

CCscore(o, T) = M(o, S) / |S|     (1)
where M(o, S) is the function that counts the number of sets of expanded terms Si that contain at least one term matching a label from the ontology.

Example 2. Suppose that we start from the set of terms presented in Example 1. Then S = {S1, S2}, with S1 = {leukocyte, leucocyte, white blood cell, white cell, white blood corpuscle, white corpuscle, wbc} and S2 = {blood}. Also suppose that we are working with an ontology o which has only five classes, whose labels are {finger, white cell, cell, bone, skull}. Then M(o, S) = 1 (the matching term is white cell from S1), |S| = 2 and CCscore(o, T) = 0.5.

3.3 Semantic Richness Evaluation
Ontologies with a richer set of elements are potentially more useful than simpler ones. In this section, we present a metric to assess the semantic richness of an ontology. It is called the Semantic Richness Score (SRscore), and it is based on the constructors provided by the Ontology Web Language (OWL)4, which is the most popular and most widely used language for ontology construction. In OWL, concepts are called classes, and each class is considered as a set of individuals that share similar characteristics. We propose to measure the semantic richness of an ontology according to the presence of the following OWL elements:

─ Object properties. They represent relations between instances of classes. A widely-known example of an object property is the is_a hierarchical relation.
─ Datatype properties. Datatype properties relate individuals to RDF literals or simple types defined in accordance with XML Schema datatypes. For example, the datatype property hasAddress might link individuals from a class Person to the XML Schema datatype xsd:string.
─ Annotation properties. They provide additional semantic information to the ontology, such as definitions of terms, labels, comments, creation date, author, etc. As an example, the NCI Thesaurus5, which is a popular ontology for the cancer domain, provides several definitions for the concept colorectal carcinoma. One of them is: A cancer arising from the epithelial cells that line the lumen of the colon. Other information provided by this ontology using annotation properties includes the synonyms for this concept (e.g. Cancer of the Large Bowel, Carcinoma of Large Intestine, Colorectal Cancer, etc.), the most common name for this concept, its semantic type, etc.
On the basis of these elements, the semantic richness score (SRscore) is defined.

Definition 2. Let o be an ontology.
http://www.w3.org/TR/owl-features/ http://ncit.nci.nih.gov/
SRscore(o) = wobj · obj(o) + wdat · dat(o) + wann · ann(o)     (2)
where obj(o) is the number of object properties in the ontology, dat(o) is the number of datatype properties, ann(o) is the number of annotation properties, and wobj, wdat, wann are the weight factors.

Example 3. The NCI Thesaurus (version 09.12d) has 124 object properties, 72 datatype properties and 78 annotation properties. Its SRscore would be calculated as: SRscore = 0.3 · 124 + 0.3 · 72 + 0.4 · 78 = 90 (wobj = 0.3, wdat = 0.3, wann = 0.4).

3.4 Popularity Evaluation
Apart from evaluating how well an ontology covers a specific context and the richness of its structure, there is another aspect that requires special attention. According to the definition of ontology provided by Studer [15], which can be considered one of the most precise and complete definitions up to now, "an ontology captures consensual knowledge, that is, it is not private to some individual, but accepted by a group". With this in mind, any method addressed to evaluate an ontology should consider the level of collective acceptability or popularity of the ontology. Our approach proposes to evaluate the popularity of an ontology by using the collective knowledge stored in widely-known Web 2.0 resources, whose value is created by the aggregation of many individual user contributions. Our proposal consists of counting the number of references to the ontology in each resource. Depending on the kind of resource, the criteria to measure the references will be different. For example, in Wikipedia we could count the number of results obtained when searching for the name of the ontology; in Del.icio.us, we could measure the number of users who have added the URL of the ontology to their favorites; in Google Scholar, we could retrieve the number of papers which reference the ontology; or even in Facebook, we could check the number of users in specific groups dedicated to the ontology. Apart from these general-purpose resources, Web 2.0 resources aimed at specific shared domains, like the NCBO BioPortal6 for the biomedical area, could also be taken into account. According to this idea, we propose to assess the popularity of an ontology on the basis of the following metric.

Definition 3. Let o be an ontology, R = {r1, ..., rn} the set of Web 2.0 resources taken into account, and P = {p1, ..., pn} the number of references to the ontology from each resource in R, according to a set of predefined criteria.
Pscore(o, R) = Σi=1..n wi · αi · pi     (3)
where wi is a weight factor and αi is a factor to balance the number of references among criteria that provide results in different ranges (default value 1).

Example 4. Suppose that we are evaluating the popularity of the Gene Ontology (GO)7, which is a well-known ontology aimed at providing consistent descriptions of gene products, and that we just want to take into account the Wikipedia and Del.icio.us
http://bioportal.bioontology.org/ http://www.geneontology.org/
resources. At the moment of writing this paper, in Wikipedia there are 11 pages with the text "Gene Ontology", and at Del.icio.us there are 313 users who have added the GO website to their bookmarks. The Pscore would be worked out according to the following expression: Pscore = 0.5 · 1 · 76 + 0.5 · 1 · 313 = 194.5 (w1 = 0.5, w2 = 0.5, α1 = 1, α2 = 1).

3.5 Aggregated Ontology Score
The final score for an ontology is calculated once the three previously explained measures have been applied to all the ontologies that are being evaluated. It is calculated by aggregating the obtained values, taking into account the weight (importance) of each measure, according to the following definition.

Definition 4. Let o be an ontology from a set of candidate ontologies O, T an initial set of terms and R a set of Web 2.0 resources.

Score(o, T, R) = wCC · CCscore(o, T) + wSR · SRscore(o) / max(SRscore(O)) + wP · Pscore(o, R) / max(Pscore(O, R))     (4)
where wCC, wSR and wP are the weights assigned to CCscore, SRscore and Pscore, respectively. As can be seen in the expression, the values of SRscore and Pscore are normalised to the range [0, 1] by dividing them by the maximum value of the measure over all ontologies. Finally, ontologies are ranked according to their score.
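A minimal sketch of how the aggregation of Eq. (4) could be computed, assuming the three raw measures have already been obtained for each candidate ontology, is shown below. Class and method names are our own illustration.

```java
import java.util.List;

public class OntologyScorer {

    /** Container for the per-ontology raw measures of Eqs. (1)-(3), assumed precomputed. */
    public static class Measures {
        public final double ccScore;   // context coverage, already in [0, 1]
        public final double srScore;   // semantic richness, unnormalised
        public final double pScore;    // popularity, unnormalised
        public Measures(double ccScore, double srScore, double pScore) {
            this.ccScore = ccScore;
            this.srScore = srScore;
            this.pScore = pScore;
        }
    }

    /**
     * Aggregated score of Eq. (4): SRscore and Pscore are normalised by their maxima
     * over the candidate set before the weighted combination.
     */
    public static double[] aggregate(List<Measures> candidates,
                                     double wCC, double wSR, double wP) {
        double maxSR = candidates.stream().mapToDouble(m -> m.srScore).max().orElse(1.0);
        double maxP  = candidates.stream().mapToDouble(m -> m.pScore).max().orElse(1.0);
        double[] scores = new double[candidates.size()];
        for (int i = 0; i < candidates.size(); i++) {
            Measures m = candidates.get(i);
            double sr = maxSR > 0 ? m.srScore / maxSR : 0.0;
            double p  = maxP  > 0 ? m.pScore  / maxP  : 0.0;
            scores[i] = wCC * m.ccScore + wSR * sr + wP * p;
        }
        return scores;   // ontologies are then ranked by decreasing score
    }
}
```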
4 Evaluation and Results

On the basis of the proposed approach, we have implemented an early prototype of an ontology recommender system for the biomedical domain and carried out an experiment to test the validity of the approach. Starting from the following set of initial terms {colorectal carcinoma, apoptosis, leep, combination therapy, chemotherapy, DNA, skull, pelvic girdle, cavity of stomach, leukocyte}, randomly chosen from the areas of cancer and anatomy, the prototype provides the results that are shown in Table 1 (only the top 10 recommended ontologies are shown). The repository of candidate ontologies is composed of 30 ontologies: 15 of them are widely known biomedical ontologies, while the other 15 correspond to the top results returned by Swoogle when searching for the initial terms. The prototype uses WordNet and UMLS to semantically clean and expand the initial terms. The weights for the Semantic Richness Evaluation are wobj = 0.2, wdat = 0.2, wann = 0.6. As Web 2.0 resources, we chose Wikipedia and BioPortal. The prototype counts the number of search results returned by Wikipedia when searching for the name of the ontology and also checks whether the ontology is listed in BioPortal. We used the following weights and parameters for popularity: w1 = 0.5, w2 = 0.5, α1 = 1, α2 = 10. Finally, the weights for the score aggregation are wCC = 0.6, wSR = 0.1, wP = 0.3. As we consider it vital to recommend ontologies that provide a good coverage of the domain, the greatest weight was given to the Context Coverage Score, followed by the Popularity Score and the Semantic Richness Score.
Table 1. Results of the evaluation

Ontology                          CCscore   SRscore   Pscore   Total Score
NCI Thesaurus                     0.900     0.089     0.140    0.591
Medical Subject Headings          0.600     0.006     0.628    0.549
SNOMED Clinical Terms             0.800     0.014     0.128    0.520
Gene Ontology                     0.200     0.045     1.000    0.425
Foundational Model of Anatomy     0.500     0.065     0.186    0.362
ACGT Master Ontology              0.400     0.050     0.128    0.283
Ontosem                           0.400     0.001     0.000    0.240
ResearchCyc                       0.300     0.103     0.070    0.211
Systems Biology Ontology          0.100     0.001     0.186    0.116
TAP                               0.100     0.000     0.000    0.060
On examining the results, some interpretations stand out: (1) The NCI Thesaurus cancer ontology is the most recommended ontology, providing 90% coverage of the initial terms; considering that most of the initial terms belong to the cancer domain, this result makes sense. (2) The top 5 recommended ontologies are widely-known biomedical ontologies. (3) Ontologies with a low popularity are ranked low in the results, even if they provide an acceptable context coverage (e.g. Ontosem). (4) An ontology with a high popularity can have a better aggregated score than another ontology which has better domain coverage (e.g. Medical Subject Headings and SNOMED).
5 Conclusions and Future Work

Automatic ontology recommendation is a crucial task to enable ontology reuse in emerging knowledge-based applications and domains like the upcoming Semantic Web. In this work, we have presented an approach for the automatic recommendation of ontologies which, to the best of our knowledge, is unique in the field of ontology recommendation. Our approach is based on evaluating three different aspects of an ontology: (1) how well the ontology represents a given context, (2) its semantic richness and (3) the popularity of the ontology. Results show that ontology popularity is a crucial criterion to obtain a valid recommendation, and that the Web 2.0 is an adequate resource to obtain the collaborative knowledge required for this kind of evaluation. As immediate future work, we plan to develop strategies to automatically identify the best weights and parameters for the operating context, instead of having to adjust their values by hand. Investigating new methods to evaluate ontology popularity is another future direction. We are also interested in studying whether reasoning techniques can help to improve our scoring strategies.
Acknowledgements Work supported by the “Galician Network for Colorectal Cancer Research” (REGICC, Ref. 2009/58), funded by the Xunta de Galicia, and the "Ibero-NBIC Network" (209RT0366) funded by CYTED. The work of Marcos Martínez-Romero is supported by a predoctoral grant from the UDC. The work of José M. Vázquez-Naya
is supported by a FPU grant (Ref. AP2005-1415) from the Spanish Ministry of Education and Science. Cristian R. Munteanu acknowledges the support for a research position by “Isidro Parga Pondal” Program, Xunta de Galicia.
References 1. Gómez-Pérez, A., Fernández-López, M., Corcho, O.: Ontological Engineering: With Examples from the Areas of Knowledge Management. In: E-Commerce and the Semantic Web. Springer, Heidelberg (2004) 2. Gruber, T.R.: Toward principles for the design of ontologies used for knowledge sharing. Int. Journal of Human Computer Studies 43, 907–928 (1995) 3. Berners-Lee, T., Hendler, J., Lassila, O.: The semantic Web. Scientific American 284, 34– 43 (2001) 4. Gómez-Pérez, A.: From Knowledge Based Systems to Knowledge Sharing Technology: Evaluation and Assessment. Knowledge Systems Lab., Stanford University, CA (1994) 5. Supekar, K., Patel, C., Lee, Y.: Characterizing quality of knowledge on semantic web. In: Seventeenth International FLAIRS Conference, Miami, Florida, USA, pp. 220–228 (2004) 6. Alani, H., Noy, N., Shah, N., Shadbolt, N., Musen, M.: Searching ontologies based on content: experiments in the biomedical domain. In: The Fourth International Conference on Knowledge Capture (K-Cap), Whistler, BC, Canada, pp. 55–62. ACM Press, New York (2007) 7. Netzer, Y., Gabay, D., Adler, M., Goldberg, Y., Elhadad, M.: Ontology Evaluation Through Text Classification. In: Chen, L., Liu, C., Zhang, X., Wang, S., Strasunskas, D., Tomassen, S.L., Rao, J., Li, W.-S., Candan, K.S., Chiu, D.K.W., Zhuang, Y., Ellis, C.A., Kim, K.-H. (eds.) RTBI 2009. LNCS, vol. 5731, pp. 210–221. Springer, Heidelberg (2009) 8. Vilches-Blázquez, L., Ramos, J., López-Pellicer, F., Corcho, O., Nogueras-Iso, J.: An approach to comparing different ontologies in the context of hydrographical information. In: Heidelberg, S.B. (ed.) Information Fusion and Geographic Information Systems, vol. 4, pp. 193–207. Springer, Heidelberg (2009) 9. Jonquet, C., Shah, N., Musen, M.: Prototyping a Biomedical Ontology Recommender Service. In: Bio-Ontologies: Knowledge in Biology, Stockholm, Sweden (2009) 10. Brank, J., Grobelnik, M., Mladenic, D.: A survey of ontology evaluation techniques. In: 8th International Multi-Conference Information Society IS, pp. 166–169 (2005) 11. Ding, L., Pan, R., Finin, T., Joshi, A., Peng, Y., Kolari, P.: Finding and ranking knowledge on the semantic web. In: Gil, Y., Motta, E., Benjamins, V.R., Musen, M.A. (eds.) ISWC 2005. LNCS, vol. 3729, pp. 156–170. Springer, Heidelberg (2005) 12. Buitelaar, P., Eigner, T., Declerck, T.: Ontoselect: A dynamic ontology library with support for ontology selection. Demo Session at the International Semantic Web Conference. Hiroshima, Japan (2004) 13. Alani, H., Brewster, C.: Ontology ranking based on the analysis of concept structures. In: 3rd International Conference on Knowledge Capture (K-Cap), Banff, Canada, pp. 51–58 (2005) 14. Jones, M., Alani, H.: Content-based ontology ranking. In: 9th Int. Protégé Conference, Stanford, CA (2006) 15. Studer, R., Benjamins, V.R., Fensel, D.: Knowledge engineering: principles and methods. Data & Knowledge Engineering 25, 161–197 (1998)
Knowledge Mining with ELM System Ilona Bluemke and Agnieszka Orlewicz Institute of Computer Science, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland [email protected]
Abstract. The problem of knowledge extraction from the data left by web users during their interactions is a very attractive research task. The extracted knowledge can be used for different goals such as service personalization, site structure simplification, web server performance improvement or even for studying the human behavior. We constructed a system, called ELM (Event Logger Manager), able to register and analyze data from different applications. The registered data can be specified in an experiment. ELM provides several knowledge mining algorithms i.e. Apriori, ID3, C4.5. The objective of this paper is to present knowledge mining in data from interactions between user and a simple application conducted with ELM system. Keywords: web usage mining system, knowledge mining.
1 Introduction

The time spent by people in front of computers still increases, so the problem of knowledge extraction from the enormous amount of data left by web users during their interactions is a research task that has gained increasing attention in recent years. The analysis of such data can be used to understand users' preferences and behaviors in a process commonly referred to as Web Usage Mining (WUM). The extracted knowledge can make Web site management much more effective by enabling service personalization, site structure simplification, web server performance improvement, or even the study of human behavior. Leveraging these techniques, we can create adaptive Web sites which better meet user requirements. In the past, several WUM systems have been implemented; some of them are presented in section 2. At the Department of Electronics and Information Systems, Warsaw University of Technology, a WUM system called ELM (Event Logger Manager) was designed and implemented [1]. Our system significantly differs from existing WUM systems. The majority of WUM systems are dedicated to one web server or one application. ELM is flexible and easy to integrate with any Java application. We use aspect modification, as proposed in [2], to add the code responsible for registering the required data. ELM is able to collect and store data from users' interactions with many applications in one database, a feature not provided by other systems. The ELM system contains several knowledge mining algorithms, e.g. Apriori, ID3 and C4.5, taken from the weka library [3], but other algorithms can be added without any difficulty. In our system a user can also decide on the kind of data registered and filtered and, to
our best knowledge, this property is not available in other systems. In section 3 the ELM system is briefly presented. The objective of this paper is to present how web usage mining can be conducted with the ELM system. The experiment is described in section 4 and conclusions are given in section 5.
2 Web Usage Mining Systems

The amount of information placed on the Internet is still increasing. During navigation, web users also leave many records of their activity. This enormous amount of data can be a useful source of knowledge, but sophisticated processes are needed for the analysis of these data. Data mining algorithms can be applied to extract, understand and use knowledge from these data, and all these activities are called web mining. Depending on the source of input data, web mining can be divided into three types:

• contents of internet documents are analyzed in Web Content Mining,
• the structure of internet portals is analyzed in Web Structure Mining,
• the analysis of data left by users can be used to understand users' preferences and behavior in a process commonly referred to as Web Usage Mining (WUM).
Overviews of web mining with long lists of references can be found in [4, 5, 6]. The web usage mining process consists of the following phases: data acquisition, data preprocessing, pattern discovery and pattern analysis. Data preparation and pattern discovery processes are described e.g. in [7]. Often the results of pattern analysis are fed back to the pattern discovery activity. A brief overview of pattern discovery methods is given in [8]. Pattern discovery methods can be divided into several groups:

1. statistical analysis (like frequency, mean),
2. association rules discover correlations among pages accessed,
3. clustering of users enables the discovery of groups of users with similar navigational patterns,
4. sequential patterns find inter-session patterns, i.e. the presence of a set of items followed by another item in time order,
5. classification maps a data item into one of predefined classes,
6. dependency modeling determines whether there are significant dependencies among variables.

An effective data acquisition phase is crucial for web usage mining. The sources of data for web usage mining are the following:

1. Web servers – the most common source of data. Web logs usually contain fields like date, time, user IP address, bytes sent, request line, etc. The information is usually expressed in one of the standard file formats such as Common Log Format, Extended Log Format and LogML.
2. Proxy side – data are similar to data collected at the server side, but they contain data for groups of users accessing groups of web servers.
3. Client side – data can be obtained by modified browsers, Java applets or JavaScript. This approach avoids the problem of session identification and problems caused by caching. Detailed information about users' behavior can be provided.
The analysis of data from users' interactions can be used to understand users' preferences and behavior in Web Usage Mining (WUM) [9, 10]. The extracted knowledge can be used for different goals, such as service personalization, site structure simplification, and web server performance improvement, e.g. [11, 12]. There are different WUM systems: small systems performing a fixed number of analyses, and also systems dedicated to large internet services. In the past, several WUM projects have been proposed, e.g. [13, 14, 15, 16, 17, 18]. In the Analog system [13], user activity is recorded in server log files and processed to form clusters of user sessions. The online component builds active user sessions which are then classified into one of the clusters found by the offline component. The classification makes it possible to identify pages related to the ones in the active session and to return the requested page with a list of related documents. Analog was one of the first WUM projects. The geometrical approach used for clustering is affected by several limitations, related to scalability and to the effectiveness of the results found. Nevertheless, the architectural solution introduced was maintained in several other more recent projects. The Web Usage Mining (WUM) system called SUGGEST [14] was designed to efficiently integrate the WUM process with the ordinary web server functionalities. It can provide useful information to make the web user navigation easier and to optimize the web server performance. Many web usage mining systems collaborate with an internet portal or service. The goal is to refine the service, modify its structure or make it more efficient. In this group are the systems SUGGEST [14], WUM [16], WebMe [17] and OLAM [18]. Data are extracted from www servers' logs. Some examples of patterns extracted from server logs are presented in [7]. In [8] a method of extracting patterns from Web logs based on Kohonen's Self Organizing Maps (SOM) is presented. This method is simple and fast compared to clustering algorithms. It also enables the extraction of patterns of any length. Another advantage of this approach is that the number of clusters does not need to be known. Widely known big portals and companies are using web mining for different purposes. In Amazon, techniques like associations between pages visited and click-path analysis are used to adapt the store to the individual customer. Knowledge obtained from web mining is used for instant recommendations or "wish lists" [19]. Google [20], a very popular and widely used search engine, was probably the first to identify the importance of link structure in mining information from the Web. The PageRank algorithm measures the importance of a page using structural information of the Web graph. The relevance of a page is expressed in how many other pages cite it. PageRank is used in all Google search products. Google also introduced the Advertising Program providing advertisements that are relevant to a search query. Another service offered by Google is "Google News" [21], which integrates the latest news from online versions of newspapers and organizes them categorically to make it easier for users to read the most relevant news. In eBay, web mining techniques are used to analyze bidding behavior to determine if a bid is a fraud [22]. Yahoo [23] introduced the concept of a "personalized portal". Mining usage logs provides valuable information about usage habits, thus enabling the provision of personalized contents.
An example of a system using data from the client side is Archcollect [16]. This system registers data through a modified browser. Archcollect is not able to perform
complex analysis. In [24] a method which combines graph-based web usage mining with statistical analysis of data from the client side is presented. The authors [24] also show that the results of the mining process are helpful for administrators. We have not found a system registering data on both the client and server sides.
3 ELM Web Usage Mining System

In the ELM system all phases of the web usage mining process are performed. ELM is responsible for data storage, data preparation and data analysis. The system provides a predefined set of data mining algorithms and data preparation filters, but new algorithms and custom filters can be added without difficulty. The system is able to store data in a local or remote database. An ELM user is able to define what events should be stored and when. Some basic types of events are predefined, but the user can also specify his or her own logical events. In the ELM system data are preprocessed, data mining algorithms can be executed and their results may be observed.
Fig. 1. ELM architecture
In Fig. 1 the main parts of the ELM system are presented. Data acquisition is performed by EventLogger and analysis by EventManager. Both modules use eventDB, a relational database containing the stored events. The EventLogger API is written to facilitate access to the database for the data registering and data analyzing modules. In this module logical types of events may be defined. For the defined events, parameters may also be created. The EventLogger API writes or reads events with parameter values. In order to avoid dependencies on third-party libraries, only JDBC (Java Data Base Connectivity) is used to access the database. JDBC provides methods for querying and updating data in a relational database. EventLogger is able to register events handled at the server side. We use the aspect modification proposed in [2] to log events. An aspect, written in AspectJ [25], is responsible for storing the appropriate information describing the event. This aspect can be woven into the code of a Java application, even if only the compiled code of this application is available. Registering data in aspects, instead of using default web logs, enables more flexibility in the content and format of the data, thus making the data preparation phase much easier. The structure of the ELM repository is shown in Fig. 2. More information concerning the ELM system can be found in [1].
Fig. 2. ELM repository
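As described above, events are written to eventDB through plain JDBC. A minimal illustrative sketch of such a write is given below; the table and column names are assumptions of ours, since the actual eventDB schema is only outlined in Fig. 2.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;

public class EventWriter {

    // Illustrative table and column names only; the real schema is the one in Fig. 2.
    private static final String INSERT_EVENT =
        "INSERT INTO event (event_type, session_id, occurred_at) VALUES (?, ?, ?)";

    private final Connection connection;

    public EventWriter(Connection connection) {
        this.connection = connection;
    }

    /** Stores a single logical event using plain JDBC, roughly as the EventLogger API is described to do. */
    public void storeEvent(String eventType, String sessionId) throws SQLException {
        try (PreparedStatement stmt = connection.prepareStatement(INSERT_EVENT)) {
            stmt.setString(1, eventType);
            stmt.setString(2, sessionId);
            stmt.setTimestamp(3, new Timestamp(System.currentTimeMillis()));
            stmt.executeUpdate();
        }
    }
}
```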
4 Example

To find the advantages and disadvantages of the ELM system and its ability to discover knowledge, an experiment with a real application should be conducted. We had problems finding a real application which could be monitored by the ELM system. It is natural that owners will not allow "an unknown" system to be associated with their applications. We had to find an exemplary application having most of the regular functionalities of a web application. jpetstore [26] – an application in JavaEE technology, available at sourceforge.net – was taken as a "case study". This application is available in two versions, one using Spring MVC and the other Struts. In jpetstore an animal in one of the following categories: cats, dogs, birds, fish or reptiles may be bought. When a user enters the appropriate category, the species in this category are displayed. Each species is associated with some individuals presented in a picture with detailed information about the price and the provider. During the registration process each user should give her/his personal data and preferences, e.g. the preferred animal category. A banner will remind the user to buy products for her/his pet. The goal of our investigation was to reduce the amount of data describing the preferences during the registration process and to generate hints on what to buy based on the user's interactions. Data registering was accomplished by aspect modification [2]. An aspect in AspectJ [25] was written. In this aspect, using the EventLoggerAPI, several events were defined, e.g. viewing products, adding a product into the basket, checkout, signoff. The project containing the EventLoggerAPI (in a .war file) and the aspect were deployed on a Tomcat application server. Load-time weaving was used to combine the application with its aspect. Several experiments on the registered events and data were performed. Data left by a user were multiplied and randomly interleaved to simulate many users. In the interaction data, some regularities were inserted on purpose, e.g. clients buying cats also buy fish, because they find it cheaper than buying special food. We wanted to check if ELM would be able to detect such regularities using data mining algorithms. From the beginning of our experiments the events of entering an application page and filling in forms were registered.
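A hedged sketch of what such an AspectJ aspect might look like (annotation style) is shown below. The pointcut patterns, the method names and the EventLogger facade are assumptions for illustration; the actual jpetstore class names and the EventLogger API calls used by the authors are not given in the paper.

```java
import org.aspectj.lang.JoinPoint;
import org.aspectj.lang.annotation.AfterReturning;
import org.aspectj.lang.annotation.Aspect;

// Minimal stand-in for the EventLogger API of Section 3 (illustrative only).
class EventLogger {
    void logEvent(String eventType, String detail) {
        System.out.println(eventType + ": " + detail);
    }
}

@Aspect
public class PetStoreLoggingAspect {

    private final EventLogger eventLogger = new EventLogger();

    /** Registers a PET_STORE_CHECKOUT event; the order-related pointcut is a guess. */
    @AfterReturning("execution(* *..jpetstore..*Order*.*(..))")
    public void logCheckout(JoinPoint joinPoint) {
        eventLogger.logEvent("PET_STORE_CHECKOUT", joinPoint.getSignature().toShortString());
    }

    /** Registers a PET_STORE_VIEW event; the view-related pointcuts are guesses. */
    @AfterReturning("execution(* *..jpetstore..*.viewProduct*(..)) || " +
                    "execution(* *..jpetstore..*.viewItem*(..))")
    public void logView(JoinPoint joinPoint) {
        eventLogger.logEvent("PET_STORE_VIEW", joinPoint.getSignature().toShortString());
    }
}
```

With load-time weaving, an aspect like this can be attached to the compiled jpetstore application without touching its source code, which is the design choice highlighted by the authors.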
4.1 Knowledge Mining

In this section the knowledge mining process with the ELM system working on the jpetstore application is briefly described. Association and classification learning were explored.

4.1.1 Classification Learning

The classification learning process was used to find a simple classifier giving the client's favorite category based on the contents of his/her basket and the price of all products. A special event, named PET_STORE_CHECKOUT, a kind of CartEvent, was defined. The definition is given in Fig. 3.
Fig. 3. Definition of event PET_STORE_CHECKOUT and its parameters categories and prices instead of product names
The k-fold cross-validation [10] technique was used for splitting the data into training and testing sets. The data is split into k approximately equal partitions. Each partition is in turn used for testing and the remainder for training. The learning procedure is executed k times on different training sets. The k error estimates are averaged to yield an overall error estimate. Several classification learning algorithms were executed. We started with the simple ZeroR algorithm, which was able to find Reptiles as the majority category. Next, several algorithms building decision trees, i.e. ID3, C4.5 and PART, were executed with different options. Data filters, e.g. MergeTwoValues, were also applied. The results are presented in Table 1. The best results can be observed for the C4.5 and PART algorithms.

Table 1. Comparison of results of classification algorithms. Columns contain the kind of data, the number of cases, the algorithm, the percentages of correctly classified, incorrectly classified and not classified cases, and the algorithm option.

data         no   Alg.   Correct   Incorrect   Not classified   option
products     51   ID3    68.6%     5.8%        25.4%            -
categories   21   ID3    68.7%     17.2%       13.7%            -
categories   21   C4.5   72.4%     27.5%       0                -
categories   67   C4.5   79.1%     20%         0                -
categories   67   C4.5   83.5%     16.5%       0                useLaplace
categories   67   C4.5   77.6%     22.3%       0                -M3
products     51   C4.5   80.3%     19.6%       0                binarySplit
categories   67   PART   80.5%     19.4%       0                -
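The classification runs summarized in Table 1 can be reproduced in outline with the weka API (a sketch only, based on the weka 3 API; the ARFF file name is an assumption, and J48 is weka's implementation of C4.5):

```java
import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class CheckoutClassification {

    public static void main(String[] args) throws Exception {
        // Hypothetical ARFF export of the PET_STORE_CHECKOUT events (file name is an assumption).
        Instances data = DataSource.read("petstore_checkout.arff");
        data.setClassIndex(data.numAttributes() - 1);   // favourite category as the class attribute

        J48 c45 = new J48();                            // weka's implementation of C4.5
        c45.setUseLaplace(true);                        // one of the options compared in Table 1

        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(c45, data, 10, new Random(1));  // k-fold cross-validation (k = 10)

        System.out.println("Correctly classified:   " + eval.pctCorrect() + " %");
        System.out.println("Incorrectly classified: " + eval.pctIncorrect() + " %");
        System.out.println("Unclassified:           " + eval.pctUnclassified() + " %");
    }
}
```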
In Fig. 4 the tree obtained by the ID3 algorithm on data with a reduced number of attributes is presented. From this tree a rule can be read: "for clients taking the first product from CATS and the third from FISH, the favorite category is CATS". Another rule which can be seen in this tree says that if they are buying more CATS (C1=CATS and C3=CATS)
Fig. 4. Decision tree obtained by ID3 algorithm on data with reduced number of attributes
their favorite animal is REPTILE. These rules are the result of the phenomenon introduced on purpose in the interactions, that users are buying fishes as food for cats and cats as food for reptiles. In Fig. 5 the tree generated by the C4.5 algorithm with binary tests is presented.
Fig. 5. Decision tree obtained by C4.5 algorithm on data with full number of attributes and binary tests
4.1.2 Association Learning

Three association-rule learners from the weka library were examined: Apriori, Tertius and PredictiveApriori [10, 3]. A PET_STORE_VIEW event was defined in the ELM system. This event was registered if a user entered a system page to see the description of a product or an exemplar. This event contains an additional parameter: the name of the product or exemplar. The data obtained from logging PET_STORE_VIEW events were grouped by the session identifier. The goal was to obtain records describing all categories, products and exemplars viewed by a user during one session. These data were filtered and the association-rule learners were executed by appropriate scripts in ELM. Some problems were observed. PredictiveApriori ended with an "insufficient stack" error. Tertius was working for a long time and was not able to produce
reasonable results. Apriori also did not generate reasonable results at first. To solve this problem, the set of attributes was reduced. The generated rules are presented in Figs. 6–8. These rules concern some regularities about "not viewing some products" and can be translated into clients' behavior in the "real" world.
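A sketch of how such an association-learning run can be scripted with the weka API is given below (illustrative only; the ARFF file name and the parameter values are assumptions):

```java
import weka.associations.Apriori;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class ViewAssociationRules {

    public static void main(String[] args) throws Exception {
        // Hypothetical ARFF export of the session-grouped PET_STORE_VIEW data (file name is an assumption).
        Instances data = DataSource.read("petstore_view_sessions.arff");

        Apriori apriori = new Apriori();
        apriori.setNumRules(10);          // number of rules to report
        apriori.setMinMetric(0.9);        // minimum confidence for a rule
        apriori.buildAssociations(data);

        System.out.println(apriori);      // prints the discovered association rules
    }
}
```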
Fig. 6. Rules obtained by Apriori algorithm on data concerning event PET_STORE_VIEW with some attributes removed
Fig. 7. Rules obtained by Tertius algorithm on data concerning event PET_STORE_VIEW
Fig. 8. Rules obtained by PredictiveApriori algorithm on data concerning event PET_STORE_VIEW with some attributes removed
The Apriori algorithm (Fig. 6) found a rule: "clients not buying reptiles are not buying roof skeeper cats as well". This rule is a consequence of the regularity introduced on purpose in the data that "clients buying reptiles are also buying cats as a food for reptiles". In the results generated by Tertius it can be seen that "clients not buying food for kittens are also not buying cats", which is obvious. PredictiveApriori also generated the obvious rule that "clients not buying reptiles and not buying food for them are also not buying cats".

4.2 Results of Experiment

Although in the above described case study only a simple internet application was examined, the obtained results are very promising and close to the "real world". In the knowledge mining process typical activities were practiced; such activities will also appear in more complex internet applications. This case study revealed advantages and disadvantages of the ELM system. Thanks to aspect modification, the ELM system can be easily "joined" with any Java application, even if the source code of this application is not available. The amount of AspectJ code which must be prepared is limited (aspect, pointcut). The ELM system is flexible: different types of events, even ones defined by the user, can be logged. The attributes of the registered data and their number can be easily changed. The user is able to monitor the knowledge mining process and tune it. Another advantage of the ELM system is the support in the data processing phase: different filters are available, and especially the ability to group data. Currently the system is not able to apply a sequence of filters, and such a feature could facilitate the data processing phase. Another functionality, very convenient in the data processing phase, which should be considered in a future version of ELM, could be splitting the data into training and testing sets (currently cross-validation is used [10]).
5 Conclusions

Knowledge about users and understanding user needs are essential for many web applications. Using algorithms for knowledge discovery it is possible to find many interesting trends, obtain patterns of user behavior and better understand user needs. In section 3 we briefly presented a general system for web usage mining and business intelligence reporting. We cannot compare our small system to giants like Google, Yahoo or Amazon. To the best of our knowledge, the details of the web usage mining processes in these systems are not published, so we cannot make a more thorough comparison. The ELM system is flexible and easy to integrate with Java web applications by aspect modification. It is able to collect and store data from users' interactions with many applications in one database and analyze them using different knowledge discovery algorithms. A user of our system may add his or her own algorithms or tune the implemented ones, as well as decide which events should be captured and filtered and which pattern discovery method should be used. Other WUM systems are usually strictly associated with a portal or one application, use server log files with "fixed contents" and contain one group of pattern discovery methods.
The simple example described in section 4 shows that, using the ELM system with its data mining algorithms, some useful information can be deduced from user interactions. To obtain more convincing results, similar experiments should be conducted on a "real" application.
References 1. Bluemke, I., Orlewicz, A.: System for knowledge mining in data from interaction between user and application. In: Man-Machine Interactions, Advances in Intelligent and Soft Computing, vol. 59, pp. 103–110. Springer, Heidelberg (2009) 2. Bluemke, I., Billewicz, K.: Aspects in the maintenance of compiled programs. In: Proceedings of Third International Conference on Dependability of Computer Systems, DepCoSRELCOMEX 2008, pp. 253–260 (2008) 3. weka, http://www.cs.waikato.ac.nz/ml/wek 4. Srivastava, J., et al.: Web usage mining: Discovery and applications of usage patterns from web data. SIGKDD Explorations 1(2), 12–23 (2000) 5. Srivastava, J., Desikan, P., Kumar, V.: Web mining – concepts, applications and research directions. In: Data Mining: Next Generations, Challenges and Future Directions. AAAI/MIT Press, Boston (2003) 6. Berendt, B., Stumme, G., Hotho, A.: Usage Mining for and on the Semantic Web. In: Horrocks, I., Hendler, J. (eds.) ISWC 2002. LNCS, vol. 2342, pp. 264–278. Springer, Heidelberg (2002) 7. Nina, S.P., et al.: Pattern discovery of web usage mining. In: Int. Conf. on Computer Tech. and Devel. IEEE Xplore (2009), 978-0-7695-3892-1/09 8. Etminani, K., Delui, A.R., Yaneshsari, N.R., Rouhani, M.: Web usage mining: discovery of the users’ navigational pattern using SOM. IEEE Xplore (2009), 978-1-4244-4615-5/09 9. Chen, J., Wei, L.: Research for web usage mining model. In: Computational Intelligence for Modelling. Control and Automation, vol. 8 (2006), ISBN: 0-7695-2731-0 10. Witten, I.H., Frank, E.: Data Mining: Practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco (2005) 11. Clark, L., et al.: Combining ethnographic and clickstream data to identify user web browsing strategies. Information Research 11(2) (2006) 12. Nasraoui, O.: World wide web personalization. In: Wang, J. (ed.) Encyclopedia of Data Mining and Data Warehousing. Idea Group, USA (2005) 13. Analog (04, 2009), http://www.analog 14. Baraglia, R., Palmerini, P.: Suggest: A web usage mining system. In: Proceedings International Conference on Information Technology: Coding and Computing, pp. 282–287 (2002), ISBN:0-7695-1503-1 15. Botia, J.A., Hernansaez, J.M., Gomez-Skarmeta, A.: METALA: a distributed system for Web usage mining. In: Mira, J., Álvarez, J.R. (eds.) IWANN 2003. LNCS, vol. 2687, pp. 703–710. Springer, Heidelberg (2003) 16. de Castro Lima, J., et al.: Archcollect front-end: A web usage data mining knowledge acquisition mechanism focused on static or dynamic contenting applications. In: ICEIS, vol. 4, pp. 258–262 (2004) 17. Lu, M., Pang, S., Wang, Y., Zhou, L.: WebME - web mining environment. In: International Conference on Systems, Man and Cybernetic, vol. (7) (2002)
18. Cercone, X.H.: An OLAM framework for Web usage mining and business intelligence reporting. In: Proceedings of the 2002 IEEE International Conference on Fuzzy Systems, vol. (2), pp. 950–955 (2002) 19. Amazon, http://www.amazon.com 20. Google, http://google.com 21. Google news, http://news.google.com 22. Colet, E.: Using data mining to detect fraud in auctions, DSstar (2002) 23. Yahoo, http://www.yahoo.com 24. Heydari, M., et al.: A Graph –Based Web usage mining method considering client side data. In: Int. Conf. on Electrical Eng. Informatics, Malaysia. IEEE Xplore (2009), 978014244-4913-2/09/ 25. AspectJ, http://www.eclipse.org/aspectj/ 26. JPetStore, http://sourceforge.net/projects/ibatisjpetstore/
DOCODE-Lite: A Meta-Search Engine for Document Similarity Retrieval Felipe Bravo-Marquez1, Gaston L’Huillier1 , Sebasti´ an A. R´ıos1 , 1 Juan D. Vel´ asquez , and Luis A. Guerrero2 1
University of Chile, Department of Industrial Engineering, Santiago, Chile {fbravo,glhuilli}@dcc.uchile.cl, {srios,jvelasqu}@dii.uchile.cl 2 University of Chile, Department of Computer Science, Santiago, Chile [email protected]
Abstract. The retrieval of similar documents from large scale datasets has been one of the main concerns in knowledge management environments, such as plagiarism detection, news impact analysis, and the matching of ideas within sets of documents. In all of these applications, a light-weight architecture can be considered as fundamental for the large scale of information that needs to be analyzed. Furthermore, the relevance score for document retrieval can be significantly improved by using several previously built search engines and taking into account the relevance feedback from users. In this work, we propose a web-services architecture for the retrieval of similar documents from the web. We focus on software engineering to support the manipulation of users' knowledge in the retrieval algorithm. A human evaluation of the relevance feedback of the system over a built set of documents is presented, showing that the proposed architecture can retrieve similar documents by using the main search engines. In particular, the document plagiarism detection task was evaluated, and its main results are shown. Keywords: Meta-Search architecture, Plagiarism Detection, Random Query Generation, Document Similarity Retrieval, SOA.
1 Introduction
The access to massive amounts of information is considered one of the main potentials of the Internet, in particular for educational institutions [9]. However, this access can lead to problems such as document plagiarism, which unfortunately has become widespread in today's society. Furthermore, in the educational context, the massification of the Web and of search engines has given access to bibliographic contents much larger than those generally needed for assignments. Likewise, the reviewers' task of authenticity checking of delivered documents has been compromised. This phenomenon is present both in educational institutions and in the academic environment, where scientific contributions are constantly being delivered. The mechanism that most reviewers have adopted for
authenticity checking is to search for a suspicious extract of the document in a search engine. Then, by checking all the results, the original document is manually searched for among the retrieved documents. This process can take large amounts of time, and the detection of malicious activities is not guaranteed. A solution to this is to develop a web crawler that indexes complete documents from a given seed of URLs. Then, after a dataset of documents is indexed, search mechanisms could be used for document similarity evaluation within this set of documents. Unfortunately, the resources needed for the crawling and indexing steps are difficult to obtain. A second alternative is to submit the whole document to a search engine. Unfortunately, search engines admit a fixed number of characters in their queries, for which the document needs to be chopped into several parts and then delivered in parts to the search engine, which leads to the same reviewer problem stated above. This work's contribution is an information system architecture that, given a source text passage, retrieves from the Web documents similar to this source using different search engines. Furthermore, the proposed knowledge management system is based on a web service architecture and developed using Object Oriented Programming (OOP), for which it is highly scalable and extensible to new information retrieval requirements. This paper is structured as follows: In section 2, previous work on knowledge management information systems for plagiarism detection using meta-search engine architectures is presented. Then, in section 3, the main contribution of this work and the proposed architecture are described. In section 4 the experimental setup and results are presented. Finally, the main conclusions of this paper are given in section 5 and future work is detailed in section 6.
2 Previous Work
As described in [8], meta search engines provide a single unified interface, where a user enters a specific query, the engine forwards it in parallel to a given list of search engines, and results are collated and ranked into a single list. Several of these approaches have been developed in [1,10,11]. In [6] the idea of directly exploiting the scores of each search engine is proposed, where the main information is the relative rank of each result. In [1], different ranking approaches are analyzed, for example Borda-fuse, which is based on democratic voting and the Borda count, or the weighted Borda-fuse, in which search engines are not treated equally. The document similarity retrieval problem has been studied by different researchers [4,12]. These approaches propose fingerprinting techniques for representing documents as sets of relevant terms. Also, these approaches use meta search engine architectures for retrieving an extended list of similar candidate documents. On the one hand, in [12] document snippets are retrieved from search engines and compared with the query document using cosine similarity in a Vector Space Model [7]. The previous process is deployed in a collaborative Service Oriented Architecture (SOA),
where multiple services for search result analysis are developed in a modular and flexible architecture. On the other hand, Pereira and Ziviani [4] propose to retrieve the complete text of each web document and compare it with the query document using text comparison strategies such as Patricia trees and the k-grams or shingles method. None of these approaches has used user knowledge to enhance the results.
3
DOCODE-Lite
In the following, the proposed architecture for document similarity retrieval from web documents is presented. Firstly, all the components and processes are described with their respective interactions. Secondly, the query generation process and the meta-search scoring function are described. Finally, all software engineering components are detailed in order to show how the architecture can be constructed and deployed.
3.1
Web Service Architecture
In order to allow flexible communication with other applications, the system was developed as a Web Service, implemented with the Axis2 1.5 Web Services/WSDL/SOAP engine. The DOCODE-lite architecture (Fig. 1) consists of several components, each responsible for a different task of the document similarity retrieval problem; their interaction is described as follows. The process starts with an input paragraph and finishes with a ranked list of web documents. These documents are candidates for being sources of the input paragraph and are scored by their similarity to it. The input is converted into a set of queries, generated by a randomized process that gives higher occurrence probabilities to the more relevant terms. These queries are sent in parallel to a customizable list of search engines.
Fig. 1. DOCODE-lite SOA client and server for similar document retrieval
The results of the search instances are collated and scored using their local ranks, their number of appearances, and search engine reliability parameters. Finally, the retrieved documents are ranked by this score, stored in a database, and returned to the Web Service client. The user interface also allows the user to evaluate the relevance of the results; these evaluations are stored in the database for a posterior effectiveness evaluation. The reason for developing the system as a Web Service is to inter-operate with different systems, where the needed service is independent of our user interface. Therefore, batch document processing tasks can be executed directly using the proposed architecture.
3.2
Document Fingerprinting Model and Meta Search Scoring Function
The query generation process for a paragraph is based on a randomized fingerprinting model. The process starts with the extraction of the document vocabulary and the assignment of term extraction probabilities, calculated using customizable weighting approaches such as tf or tf-idf. These weights are normalized in order to satisfy the probability normalization constraint. Afterwards, queries are constructed by concatenating successive randomized term extractions. Terms are extracted without replacement, which increases the occurrence probabilities of the remaining terms. The query length is also customizable, bearing in mind that short queries achieve very fast retrieval times in search engines, as stated in [3]. This process aims to represent the relevant information of the given document as a set of queries. All queries are sent to the list of search engines, and the results that point to the same URL are gathered. Each single result ω is defined as a tuple (s, q, r), where s is the search engine used, q the requested query, and r the rank at which the result appeared. A collated result set Ω is defined as a set of single results that point to the same URL. The scoring function for these results is defined as follows:

score(Ω) = (1/N) · Σ_{ω ∈ Ω} c_{s(ω)} / (r_ω)^{β_{s(ω)}}    (1)
where N is the number of requested queries and c_s, β_s are confidence factors for search engine s: c_s ∈ [0, 1] is a constant representing the average relevance of the best response of Web search engine s, and β_s is the decay factor of the results’ relevance as the number of retrieved results increases. This score aims to estimate the relevance of the collated result using its local ranks and the reliabilities of the Web search engines. The strategy is inspired by the idea that, if a URL appears many times and at high ranks, the probability that the pointed document is related to the document from which the queries were generated should be high.
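As an illustration of the two steps above, the following Java sketch shows weighted term sampling without replacement for query generation and the score of Eq. (1). It is a minimal sketch, not the actual DOCODE-lite code: the class and method names (FingerprintSketch, sampleQuery, score) and the engine parameters used in main are assumptions made for the example.

import java.util.*;

// Minimal sketch (not the DOCODE-lite code): randomized query generation and Eq. (1).
public class FingerprintSketch {

    // A single result ω = (s, q, r): search engine id, query, local rank (1-based).
    static class SingleResult {
        final String engine; final String query; final int rank;
        SingleResult(String engine, String query, int rank) {
            this.engine = engine; this.query = query; this.rank = rank;
        }
    }

    // Build one query of 'length' terms by weighted sampling without replacement.
    static String sampleQuery(Map<String, Double> termWeights, int length, Random rnd) {
        Map<String, Double> weights = new HashMap<>(termWeights);
        StringBuilder query = new StringBuilder();
        for (int i = 0; i < length && !weights.isEmpty(); i++) {
            double total = 0;
            for (double w : weights.values()) total += w;
            double u = rnd.nextDouble() * total;        // normalization handled implicitly
            String chosen = null;
            for (Map.Entry<String, Double> e : weights.entrySet()) {
                u -= e.getValue();
                if (u <= 0) { chosen = e.getKey(); break; }
            }
            if (chosen == null) chosen = weights.keySet().iterator().next();
            query.append(i == 0 ? "" : " ").append(chosen);
            weights.remove(chosen);                      // sampling without replacement
        }
        return query.toString();
    }

    // Eq. (1): score(Ω) = (1/N) * Σ_{ω∈Ω} c_{s(ω)} / r_ω^{β_{s(ω)}}
    static double score(List<SingleResult> collated, int numQueries,
                        Map<String, Double> c, Map<String, Double> beta) {
        double sum = 0;
        for (SingleResult w : collated) {
            sum += c.get(w.engine) / Math.pow(w.rank, beta.get(w.engine));
        }
        return sum / numQueries;
    }

    public static void main(String[] args) {
        // Hypothetical term frequencies for one paragraph.
        Map<String, Double> tf = Map.of("plagiarism", 3.0, "detection", 2.0, "fingerprint", 1.0);
        System.out.println(sampleQuery(tf, 2, new Random(42)));

        // Hypothetical confidence/decay parameters per engine.
        Map<String, Double> c = Map.of("engineA", 0.9, "engineB", 0.8);
        Map<String, Double> beta = Map.of("engineA", 1.0, "engineB", 0.7);
        List<SingleResult> omega = List.of(
            new SingleResult("engineA", "plagiarism detection", 1),
            new SingleResult("engineB", "detection fingerprint", 3));
        System.out.println(score(omega, 4, c, beta));
    }
}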
3.3
Software Design and Architecture Engineering
As presented in the class diagram¹ in Fig. 2, the previously described components are modeled with an object-oriented programming approach. The DocodeLiteService class receives as constructor input the document as a string, the list of search engines, and the number of queries to generate. Queries are generated by a QueryGenerator object instance and used to build a MetaSearch object. This object creates instances of Search objects, with their respective search engines and the generated queries as constructor arguments. Each search engine wrapper must be implemented by inheriting from the abstract class Search. The MetaSearch object creates Search instances using the reflection metaprogramming paradigm, which allows a class instance to be created from only its class name and constructor parameters. Each Search instance must submit the assigned query to the search engine and create a QueryAnswer instance for each result. The QueryAnswer object (ω) has as instance variables the URL of the pointed web document, its title, and its local rank of occurrence. The MetaSearch object collates and aggregates QueryAnswers when they point to the same URL. The aggregation process is supported by a HashMap data structure, mapping each URL to a single MetaAnswer (Ω), whose score (Eq. 1) is determined by the QueryAnswer instance variables. Finally, the DocodeLiteService collects all MetaAnswers and returns them to the Web Service client.
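To make the reflection-based instantiation and the HashMap collation more concrete, here is a hedged Java sketch. It mirrors the class names of the diagram (Search, QueryAnswer, MetaAnswer, MetaSearch), but the constructor signatures, fields and the DummySearch/MetaSearchDemo classes are illustrative assumptions, not the actual implementation.

import java.lang.reflect.Constructor;
import java.util.*;

// Hedged sketch of the reflection-based instantiation and HashMap collation described above.
abstract class Search {
    protected final String query;
    Search(String query) { this.query = query; }
    abstract List<QueryAnswer> request();           // ask the engine, return ranked answers
}

class QueryAnswer {                                  // ω: url, title, local rank
    final String url, title; final int rank;
    QueryAnswer(String url, String title, int rank) { this.url = url; this.title = title; this.rank = rank; }
}

class MetaAnswer {                                   // Ω: all single answers pointing to one URL
    final String url; final List<QueryAnswer> answers = new ArrayList<>();
    MetaAnswer(String url) { this.url = url; }
}

class MetaSearch {
    // Create a Search instance knowing only its class name, as the reflection paradigm allows.
    static Search create(String className, String query) throws Exception {
        Class<?> clazz = Class.forName(className);
        Constructor<?> ctor = clazz.getDeclaredConstructor(String.class);
        return (Search) ctor.newInstance(query);
    }

    // Collate QueryAnswers that point to the same URL into MetaAnswers.
    static Collection<MetaAnswer> collate(List<QueryAnswer> all) {
        Map<String, MetaAnswer> byUrl = new HashMap<>();
        for (QueryAnswer qa : all) {
            byUrl.computeIfAbsent(qa.url, MetaAnswer::new).answers.add(qa);
        }
        return byUrl.values();
    }
}

// Hypothetical engine wrapper used only to exercise the sketch.
class DummySearch extends Search {
    DummySearch(String query) { super(query); }
    List<QueryAnswer> request() {
        return List.of(new QueryAnswer("http://example.org/a", "A", 1));
    }
}

class MetaSearchDemo {
    public static void main(String[] args) throws Exception {
        Search s = MetaSearch.create("DummySearch", "plagiarism detection");
        System.out.println(MetaSearch.collate(s.request()).size()); // 1
    }
}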
Fig. 2. Class diagram for DOCODE-lite software architecture
¹ Unified Modeling Language: http://www.uml.org/
4
Experiments and Results
In the present section, all parameters used in this research are introduced and the different information retrieval options are discussed. Finally, the evaluation mechanism used to assess the architecture is presented. According to the previously described architecture, a prototype was implemented. It was developed in the Java programming language (JDK 1.6.0), providing a JavaServer Pages (JSP) user interface, mounted on the Apache Tomcat 6.0 Servlet Container. The JSP user interface includes a text area for the input text and a relevance feedback button for each returned result, which allows the user to evaluate the results’ relevance. The query generation algorithm was implemented with the Apache Lucene (2.9.0) framework using a term frequency weighting approach and a stopword list. The input area allows a user to insert a text without length constraints. However, if the whole document is used as input, the number of non-relevant pages retrieved could increase because of the randomized query generation process. Therefore, we recommend using a single paragraph instead of a whole document, since a paragraph is a self-contained, independent information unit. A database model was designed to store the relevant information of each run (parameters, input paragraph, search engines, MetaAnswers, and QueryAnswers) and its results (ranking, scores, relevance feedback, and URLs), produced through the usage of the system and deployed in MySQL 5.1.37².
4.1
Experimental Setup
A hand-crafted set of paragraphs was extracted from a sample of web documents. For this, a total of 160 paragraphs were selected from different Spanish-language web sites. The respective URLs were stored and classified into three document types: bibliographic documents or school essays (type 1), blog entries or personal web pages (type 2), and news (type 3). These paragraphs were then sent to the system as input. The top 15 answers of each paragraph search were manually reviewed and classified as relevant or non-relevant. The criterion to label an answer as relevant was that the retrieved document must contain the given paragraph exactly³. Table 1 shows the number of paragraphs, generated queries and retrieved documents per type. The search engines selected for the experiments were Google, Yahoo!, and Bing. The parameters used for each search engine in the query generation and ranking procedures, as well as their formal estimation, have been intentionally omitted.
² This database is freely available for download at http://dcc.uchile.cl/~fbravo/docode/dbcorpus.sql
³ This corpus can be downloaded from http://dcc.uchile.cl/~fbravo/docode/corpus.xml
Table 1. Number of paragraphs, queries, and retrieved documents for each type, used in the evaluation procedure

                        Type 1   Type 2   Type 3   All Types
Paragraphs                  77       53       30         160
Generated Queries          539      371      210        1120
Retrieved Documents       1155      795      450        2400
4.2
Evaluation Criteria
The goal of this experiment is to measure the effectiveness of the architecture in satisfying user information needs; in this case, those needs are related to the document similarity retrieval problem. The relevance of the results was assessed by manual inspection. The performance measure used in this work is precision at k, defined in equation (2):

precision at k = (relevant documents retrieved in the top k results) / (documents retrieved in the top k results)    (2)
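A small sketch of how precision at k can be computed from a ranked list of manually assigned relevance labels is given below; the class and method names are ours and are not taken from the DOCODE-lite code base.

import java.util.List;

// Minimal sketch of precision at k over a ranked list of relevance labels (true = relevant).
public class PrecisionAtK {
    static double precisionAtK(List<Boolean> relevanceByRank, int k) {
        int cutoff = Math.min(k, relevanceByRank.size());
        int relevant = 0;
        for (int i = 0; i < cutoff; i++) {
            if (relevanceByRank.get(i)) relevant++;
        }
        return cutoff == 0 ? 0.0 : (double) relevant / cutoff;
    }

    public static void main(String[] args) {
        // Hypothetical labels for the top 5 answers of one paragraph search.
        List<Boolean> labels = List.of(true, true, false, true, false);
        System.out.println(precisionAtK(labels, 3)); // 2/3 ≈ 0.667
    }
}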
4.3
Results and Discussion
Table 2 shows the precision at k for the results retrieved for the whole set of paragraphs, broken down by document type. It is easy to see that the precision at k differs between document types.

Table 2. Precision at k for each type of documents for the first 10 ranks

 k    Precision All   Precision Type 1   Precision Type 2   Precision Type 3
 1        0.869            0.922              0.849              0.767
 2        0.709            0.727              0.689              0.700
 3        0.606            0.602              0.591              0.644
 4        0.530            0.526              0.509              0.575
 5        0.469            0.465              0.457              0.500
 6        0.423            0.416              0.415              0.456
 7        0.385            0.375              0.375              0.429
 8        0.358            0.344              0.356              0.396
 9        0.331            0.317              0.335              0.359
10        0.309            0.295              0.313              0.340
Firstly, type 1 has a higher precision at the first values of k than the other types. This is because bibliographic documents are often found in popular collections like Wikipedia, which are indexed by most web search engines and usually ranked at the top. In this case, a web document will appear as a result for most of the generated queries. Secondly, type 2 documents are not as popular as type 1. In
this case, blog entries or personal pages are rarely indexed by all web search engines. Finally, we can observe a lower precision for the first ranked results of type 3 documents; however, the precision at k decays more slowly. This is because news items are repeated in many different web pages, such as RSS feed aggregators, so it is possible to find a high number of relevant results among the lower-ranked documents.
Fig. 3. In (a) the precision at k for different types of documents for the first 10 ranks is presented. In (b), the precision at k for all types evaluated with all search engines is compared against a single search engine.
As shown in Fig. 3(b), the precision at k obtained when combining all search engines outperforms that of a single engine for the retrieval of all types of documents. This can be explained by the union of search engines giving better coverage of the Web, given that their intersection can be considered low [2]. In this case, the evaluation of the overall meta-search was compared against the results retrieved by a single search engine, showing that the meta-search process increases the number of relevant retrieved documents compared with a single search engine.
5
Conclusions
The effectiveness of the proposed SOA realization of the meta-search engine using an object-oriented architecture was successfully tested. Experimental results showed that the proposed architecture (DOCODE-lite) is able to address the document similarity retrieval problem. Likewise, the results show that the developed meta-search model significantly improves the retrieval capacity compared with a single search engine. This work, like previously proposed methods for Web document similarity retrieval such as [4,12], is based on randomized query generation and meta-search algorithms. However, in this case the scoring function uses the local ranks and search engine reliability instead of downloading the retrieved documents [4] or running text processing algorithms over the result snippets [12], aiming at minimizing processing time in a lightweight architecture.
The presented architecture can be applied to plagiarism detection, document impact analysis, related ideas retrieval, and other document retrieval problems. For example, in plagiarism detection, the proposed model retrieves documents similar to the suspicious one, allowing the user to determine its authenticity. Furthermore, plagiarized documents use the same words as the original source, usually changing the order of terms, so the proposed model can convert a suspicious document into a set of queries that retrieve the original source document. Likewise, for document impact analysis, the number of retrieved documents could be increased in order to review the frequency of appearance of a quoted paragraph. Many plagiarism detection tools like eTBLAST⁴ [5] compare the suspicious documents against specific topic databases. Using DOCODE-lite for the search task, the search scope could be expanded to larger document collections. Plagiarism detection tools like duplichecker⁵ allow searching with commercial search engines, but do not combine them into one single search; furthermore, their fingerprinting process is deterministic, converting each sentence directly into a query. Our probabilistic approach allows us to detect more sophisticated plagiarism techniques. Finally, Plagium⁶ seems to use a randomized query generation process, but it only uses the Yahoo! search engine.
6
Future Work
As future work, the proposed architecture could be extended to a large-scale processing environment. This leads to complex distributed system architectures and parallel processing problems that need to be taken into consideration. Also, further development of the relevance feedback from users could be considered, in an active learning scheme over the scoring parameters, in order to enhance the effectiveness of the system. It is important to consider that DOCODE-lite is a first approach towards a document plagiarism detection system. The proposed model is focused on solving the document similarity retrieval problem for a given paragraph. A complete plagiarism detection system should also allow working with large collections of documents. Such a system should be able to identify potentially plagiarized paragraphs in documents using, for example, text mining algorithms. Those paragraphs could then be submitted to the DOCODE-lite Web Service in order to retrieve a collection of candidate source documents. Afterwards, more sophisticated plagiarism detection strategies must be used. Finally, a plagiarism detection system that consumes the DOCODE-lite service will have a great advantage over other similar tools, because the DOCODE-lite Web Service gives better coverage of the Web by combining the effectiveness of commercial Web search engines.
⁴ http://etest.vbi.vt.edu/etblast3/
⁵ http://www.duplichecker.com
⁶ http://www.plagium.com/
Acknowledgment. The authors would like to thank Marcos Himmer for his useful comments on the system’s architecture and interface development. We would also like to thank the continuous support of the “Instituto Sistemas Complejos de Ingeniería” (ICM: P-05004-F, CONICYT: FBO16; www.sistemasdeingenieria.cl); the FONDEF project (DO8I-1015) entitled DOCODE: Document Copy Detection (www.docode.cl); and the Web Intelligence Research Group (wi.dii.uchile.cl).
References
1. Aslam, J.A., Montague, M.: Models for metasearch. In: SIGIR 2001: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 276–284. ACM, New York (2001)
2. Baeza-Yates, R.A., Ribeiro-Neto, B.: Modern Information Retrieval. Addison-Wesley Longman Publishing Co., Inc., Boston (1999)
3. Cormack, G.V., Palmer, C.R., Van Biesbrouck, M., Clarke, C.L.A.: Deriving very short queries for high precision and recall (MultiText experiments for TREC-7) (1998)
4. Pereira Jr., A.R., Ziviani, N.: Retrieving similar documents from the web. J. Web Eng. 2(4), 247–261 (2004)
5. Lewis, J., Ossowski, S., Hicks, J., Errami, M., Garner, H.R.: Text similarity: an alternative way to search MEDLINE. Bioinformatics 22(18), 2298–2304 (2006)
6. Rasolofo, Y., Abbaci, F., Savoy, J.: Approaches to collection selection and results merging for distributed information retrieval. In: CIKM 2001: Proceedings of the Tenth International Conference on Information and Knowledge Management, pp. 191–198. ACM, New York (2001)
7. Salton, G., Wong, A., Yang, C.S.: A vector space model for automatic indexing. Commun. ACM 18(11), 613–620 (1975)
8. Selberg, E., Etzioni, O.: The MetaCrawler architecture for resource aggregation on the web. IEEE Expert, 11–14 (January-February 1997)
9. Velasquez, J.D., Palade, V.: Adaptive Web Sites: A Knowledge Extraction from Web Data Approach. IOS Press, Amsterdam (2008)
10. Wu, Z., Meng, W., Yu, C., Li, Z.: Towards a highly-scalable and effective metasearch engine. In: WWW 2001: Proceedings of the 10th International Conference on World Wide Web, pp. 386–395. ACM, New York (2001)
11. Wu, Z., Raghavan, V., Qian, H., Rama, V., Meng, W., He, H., Yu, C.: Towards automatic incorporation of search engines into a large-scale metasearch engine. In: WI 2003: Proceedings of the 2003 IEEE/WIC International Conference on Web Intelligence, Washington, DC, USA, p. 658. IEEE Computer Society, Los Alamitos (2003)
12. Zaka, B.: Empowering plagiarism detection with a web services enabled collaborative network. Journal of Information Science and Engineering 25(5), 1391–1403 (2009)
Group Formation for Collaboration in Exploratory Learning Using Group Technology Techniques

Mihaela Cocea and George D. Magoulas

London Knowledge Lab, Birkbeck College, University of London
23-29 Emerald Street, London, WC1N 3QS, UK
{mihaela,gmagoulas}@dcs.bbk.ac.uk
Abstract. Exploratory Learning Environments (ELEs) allow learners to approach a problem in different ways; they are particularly suitable for ill-defined problems where knowledge is less structured and open-ended exploration is allowed. Moreover, multiple solutions which are equally valid are possible and a common and efficient way to convey this is by promoting and supporting students’ collaboration. Successful collaboration, however, depends on forming groups in which the activity is relevant for all members of the group. In this paper we present a computational model for group formation for open-ended exploration in ELEs by modelling the various strategies that learners adopt to solve the same task. This is underpinned by Group Technology techniques that use as criteria the learners’ strategies and the similarity among them to form groups that match pedagogical considerations. The proposed mechanism is tested in an exploratory learning environment for mathematical generalisation. Keywords: collaboration, group formation, exploratory learning environments, Group Technology.
1
Introduction
Exploratory Learning Environments (ELEs) allow learners to construct and/or explore models by varying their parameters and observing the effects of these variations. ELEs are particularly useful for ill-defined domains [1], where problems are not well structured and there are no clear boundaries between correct and incorrect approaches to solve the task. Moreover, some problems in these domains have several valid solutions but none of them can be considered better than the others. In classroom situations this is often conveyed by discussion between students under the supervision of a teacher/instructor. Although collaborative learning has been proved successful in classroom situations [2], [3], in computer-supported learning environments it does not seem to lead to the same learning benefits. One of the contributing factors is the way the collaborative groups are formed, as forming efficient groups is very important to ensure an educational benefit from the group interaction [4].
Several approaches have been proposed to address this issue, of which we briefly present only a few. For example, an algorithmic approach has been used in [5] to form homogeneous and heterogeneous groups, and a Genetic Algorithms approach was used to form mixed groups by considering the following learner characteristics: personality, competence level, learning style, an indicator of collaborative behaviour and an indicator of acting as evaluator in peer-assessment. An Ant Colony Optimisation approach is used in [6] to form heterogeneous groups based on personality traits and performance. Fuzzy C-Means clustering for the formation of homogeneous and heterogeneous groups is compared to other low-complexity algorithms in [7]. A framework for collaborative group formation was proposed in [8] using a constraint satisfaction problem formulation, in which Semantic Web Technologies and Logic Programming are used. An ontology was proposed as a framework based on learning theories that facilitates group formation and collaborative learning design [9]. Unlike previous work on group formation, our research is in the field of exploratory learning environments and in the domain of mathematical generalisation, where a problem can be solved using different strategies. The goal of collaborative activities is to discuss similarities and differences between the strategies adopted by learners to solve the task. Therefore, the criteria used in our approach are the learners’ strategies for a particular task and the similarities between those strategies. To address this, we use an approach inspired by Group Technology: learners and strategies are defined as property vectors, similarities between them are calculated, and an incidence matrix is derived on which clustering is performed. This procedure leads to the formation of homogeneous groups; however, heterogeneous groups can also be formed by choosing one or more learners from each or some of the homogeneous groups. We are aware of only one approach based on Group Technology [10] that has previously been used for group formation for educational purposes. That work was developed for distance learning environments with the purpose of grouping learners and learning objects so that students are given the opportunity to learn skills or knowledge that they do not already master [11]. Also, characteristics of solutions have previously been used to support collaboration [12]; more specifically, differences between problem solutions are detected to help students recognize and resolve conflicts between their solutions. In contrast to these works, our approach is developed for ill-defined problems and uses similarities rather than differences between approaches to solve the task. The rest of the paper is structured as follows. The next section describes the learning environment and examples of typical tasks. Section 3 briefly introduces Group Technology (GT) and defines our approach and its relation to GT. Section 4 illustrates the approach using data from a classroom session and Section 5 concludes the paper and presents some directions for future work.
2
Mathematical Generalisation with Exploratory Learning
Our research is conducted in the context of eXpresser [13], an Exploratory Learning Environment for the domain of mathematical generalisation. The system is
intended for 11 to 14 year olds and for classroom use. Individual tasks in eXpresser involve building a construction and deriving a rule from it; collaborative tasks involve discussion of similarities or differences or both between constructions produced individually. We illustrate here two typical tasks: ‘pond tiling’ and ‘stepping stones’. Pond tiling requires finding a general rule for surrounding any rectangular pond. The construction, several ways of building the construction and their corresponding rules are displayed in Fig. 1; the variables “w” and “h” refer to the width and the height of the pond. The ‘stepping stones’ task requires building a construction such as the one in Fig. 2(a) and finding a rule for the green (lighter colour) tiles in relation to the red (darker) tiles, i.e. the stepping stones; some constructions are expanded for ease of visualisation; the variable “red” refers to the number of red tiles. In these figures, the internal structure of the constructions has been highlighted for clarity. In eXpresser all constructions would look the same in the normal course of the task.
Fig. 1. ‘Pond tiling’ task, constructions and associated rules
Fig. 2. ‘Stepping Stones’ task, constructions and associated rules
As illustrated above, each task has multiple solutions, and some of these solutions are similar to each other while others are different. For example, in the ‘pond-tiling’ task the constructions in Fig. 1(c) and (e) are similar because they have the same horizontal bars, while the constructions in Fig. 1(d) and (e) are similar for having the same vertical bars. The constructions are illustrated with the same pond and the same ‘stepping stones’, respectively; however, learners build constructions of various dimensions. Therefore, in this work the notion of similarity between different strategies refers to structural similarity rather than the exact dimensions of the construction.
As the goal of collaborative activities is to discuss similarities and differences between different learner strategies, the criteria for group formation are: (a) the strategies followed by each learner; (b) the similarities between different strategies. The second criterion is important because students find it difficult to translate between different representations, which in our case correspond to strategies [14]. Therefore, when learners are less experienced, starting with similar strategies facilitates this translation. When learners become more advanced, they can be challenged to translate between totally different strategies. Consequently, we are interested in forming both homogeneous and heterogeneous groups, and to this end we use an approach inspired by Group Technology, which is described in the next section.
3
Group Formation Using Group Technology Techniques
Our approach is inspired by techniques from Group Technology (GT). A brief overview of GT is given below, followed by our approach and the similarities and differences between the two. Group Technology designates a method for the grouping of machines and parts in cellular manufacturing systems. The idea is to optimize the production of families of parts by creating machine cells: clusters of machines can be located in close proximity and be responsible for a particular family of parts (minimizing production/transfer time). Therefore, two types of groupings are considered: grouping parts into families and grouping machines into cells. This is referred to as the cell formation problem. Several methods have been used to solve it, such as descriptive procedures, cluster analysis, graph partitioning, artificial intelligence and mathematical programming [10] [15]. We are interested in the cluster analysis approach, which in turn includes several techniques: similarity coefficient, set theoretic, evaluative and other analytic methods [16]. Our approach uses array-based clustering and the similarity coefficient technique. In array-based clustering, a part-machine incidence matrix is used whose entries are either zero or one: if the entry in row i and column j is one, part j needs to visit machine i; if it is zero, no visit is needed. The array-based techniques lead to clusters of parts and machines by rearranging the order of rows and columns to form diagonal blocks of ones in the part-machine matrix. The similarity coefficient techniques involve grouping parts and machines into cells based on similarity measurements between machines and parts, and between machines. These similarity coefficients are used in agglomerative clustering techniques, of which the most popular is single linkage clustering [10]. In our approach, similarity coefficients are used to form the incidence matrix and array-based clustering is then applied to obtain the cells. For us, the parts are the learners, the machines are the strategies of a task, and the goal is to group students according to the strategy (or strategies) they used, taking into account that some strategies are similar to each other while others are different. In previous work [17] we defined a Case-based Reasoning knowledge representation to delineate the possible strategies that learners could use when solving a
Fig. 3. The procedure for group formation
task, and an identification mechanism was developed to determine which strategy is followed by the learner. Partial and complete solutions are represented as sequences of cases linked by temporal and dependency relations in a Task Model. These are mapped to the learners’ behaviour in the system using the identification mechanism, which is based on similarity metrics for each type of information used in the cases; the information on the most similar strategy (or strategies) is then stored in a Learner Model. Consequently, for the purpose of group formation, the strategies are extracted from the Task Model, and the information on which strategies the learners followed is extracted from the Learner Models. The procedure includes the following phases (see also Fig. 3), which are detailed below:
Phase 1. Represent all strategies stored in the Task Model as binary vectors that define similarities between them.
Phase 2. Retrieve learner strategies from the Learner Models and represent learners as vectors whose elements depict the existence of a relation between a learner’s strategy and a strategy stored in the set of task strategies.
Phase 3. Define resemblance coefficients and calculate them.
Phase 4. Derive the Strategies-Learners Matrix (SLM) from the results of the previous step.
Phase 5. Perform clustering on the SLM.
Definition 1. Let S be the set of strategies of a task: S = {sj}, j = 1, 2, ..., n. Every strategy can be represented as an n-dimensional vector of 0s and 1s, sj = (s1j, s2j, ..., snj), where sij = 1 if sj is similar to strategy si, and sij = 0 otherwise.
For example, the vectors of the four strategies of the ‘pond-tiling’ task illustrated in Fig. 1 are displayed in Table 1. As already mentioned in Section 2, strategy (d) (corresponding to Fig. 1(d)) is similar to itself and to strategy (e); strategy (e) is similar to itself and to strategies (c) and (d). These similarities could be automatically deduced from the existence of structurally similar components like the ones illustrated in Section 2, or they could be formulated by teachers.
Table 1. Vectors for the strategies of ‘pond tiling’ task

        (b)  (c)  (d)  (e)
(b)      1    0    0    0
(c)      0    1    0    1
(d)      0    0    1    1
(e)      0    1    1    1
Definition 2. Let L = {λk}, k = 1, 2, ..., m be the set of learners. A learner can be represented as a vector of 0s and 1s, λk = (λ1k, λ2k, ..., λnk), where λik = 1 if learner λk used strategy si, and λik = 0 otherwise.
For example, learner A, who uses strategy (b), is represented as (1 0 0 0), and learner B, who uses strategy (e), is represented as (0 0 0 1). Sometimes learners use combinations of different strategies; for example, learner C, using the (e) and (d) strategies, would be represented as (0 0 1 1). This vector formulation is based on the information stored in the Learner Models; thus, learners A and B have in their Learner Models that their constructions are most similar to strategies (b) and (e), respectively, while the Learner Model of learner C indicates that strategies (e) and (d) are the most similar.
Definition 3. For each learner vector λk and each strategy vector sj, the following are defined (see also Fig. 4):
1. a is the number of matching 1s, i.e. the number of strategies contained in both vectors;
2. b is the number of 1s in λk that are 0s in sj, i.e. the number of strategies followed by the learner which are contained in λk but not included in sj;
3. c is the number of 0s in λk that are 1s in sj, i.e. the number of strategies that the learner did not follow but are included in sj.
Fig. 4. Example for Definition 3
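As a small illustration of Definition 3, the following sketch counts a, b and c for a learner vector and a strategy vector; it uses the learner C example above against strategy (e) from Table 1, and the class and method names are ours rather than the authors’.

// Sketch of Definition 3: counts a, b, c for a learner vector λk and a strategy vector sj.
public class MatchCounts {
    static int[] abc(int[] learner, int[] strategy) {
        int a = 0, b = 0, c = 0;
        for (int i = 0; i < learner.length; i++) {
            if (learner[i] == 1 && strategy[i] == 1) a++;   // strategies in both vectors
            else if (learner[i] == 1) b++;                  // followed but not in sj
            else if (strategy[i] == 1) c++;                 // in sj but not followed
        }
        return new int[]{a, b, c};
    }

    public static void main(String[] args) {
        int[] learnerC = {0, 0, 1, 1};       // uses strategies (d) and (e)
        int[] strategyE = {0, 1, 1, 1};      // row (e) of Table 1
        int[] r = abc(learnerC, strategyE);
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // 2 0 1
    }
}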
In line with [11], we use two resemblance coefficients: one for the similarity between learners and strategies, and one for the relevance of each strategy for a particular learner.
Definition 4. The similarity coefficient (SC) between a learner λk and a strategy sj is defined as SC(λk, sj) = a / (a + b + c), for each learner λk ∈ L, k = 1, 2, ..., m and each strategy sj ∈ S, j = 1, 2, ..., n.
This coefficient was first used in GT by McAuley [18] and is in fact the Jaccard similarity coefficient, a well-known measure of similarity. A study comparing 20 similarity coefficients found it to be the most stable [19].
Definition 5. The Relevance Coefficient (RC) of a strategy sj for learner λk is defined as RC(λk, sj) = a / (a + b), for each learner λk ∈ L, k = 1, 2, ..., m and each strategy sj ∈ S, j = 1, 2, ..., n.
The strategies-learners matrix is defined as SLM = {cij}, i ∈ [1, n], j ∈ [1, m], where cij = 1 if RC ≥ θRC and SC ≥ θSC, and cij = 0 otherwise, with θRC, θSC ∈ (0, 1].
A minimum density of the matrix is necessary to obtain meaningful results. More specifically, each column should have at least a ‘1’, i.e. each learner should follow at least one strategy. Therefore, the minimum density is the number of learners, m. Consequently, to fulfil the matrix density constraint, the values of θRC and θSC could be defined dynamically for each class. To avoid unnecessary computation, however, the following were established: (a) as the relevance of a strategy for a learner should play an important role, the value of θRC should not be lower than 0.5; (b) values are calculated dynamically only if the density constraint is not satisfied with the value of 0.5 for both thresholds. Therefore, the grouping starts with the value of 0.5 for both thresholds and, if the matrix density constraint is not satisfied, the value of θSC is gradually decreased until the constraint is satisfied.
To illustrate the next phase of the procedure, i.e. the clustering, let us consider the matrix displayed in Step 1 of Fig. 5. Rank Order Clustering (ROC), one of the most frequently used methods in array-based clustering [15], is used; it involves ordering columns and rows by decreasing binary weights. The following procedure is applied, illustrated in Fig. 5:
Step 1. Assign the value 2^(m−j) to column j. Evaluate each row, Rowi = Σ_{j=1}^{m} cij · 2^(m−j), and order the rows in decreasing order. If there is no change compared to the previous order, stop; else, go to Step 2.
Step 2. Assign the value 2^(n−i) to row i. Evaluate each column, Columnj = Σ_{i=1}^{n} cij · 2^(n−i), and order the columns in decreasing order. If there is no change compared to the previous order, stop; else, go to Step 1.
Fig. 5. Steps of Rank Order Clustering example
In this example, the following clusters were formed: learners 1, 3 and 6 with strategies 1 and 3; learners 2 and 5 with strategy 2; and learner 6 with strategy 4. In this particular example the blocks of 1s are clear cut; however, that is rarely the case, showing that clusters are not independent. Also, one strategy may be used by many learners, forming a big cluster; this is known as the “bottleneck machine problem”, when a large number of components need to be processed by one machine. In the context of forming groups for collaboration, though, these constraints are not considered critical limitations for the formulation of strategies-learners clusters: if clusters are not independent, it means that some learners use other strategies besides the ones of that cluster; if many learners use the same strategy, forming a large cluster, it can be broken down into several subgroups for the purpose of the collaborative task.
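Before moving to the illustrative example, the sketch below puts the pieces together: the coefficients of Definitions 4 and 5 and the Rank Order Clustering steps applied to a binary strategies-learners matrix. It is a minimal sketch under our own naming (GroupFormationSketch, roc) and with made-up demo data, not an implementation taken from the paper.

import java.util.*;

// Hedged sketch of the coefficients (Definitions 4 and 5) and of Rank Order Clustering.
public class GroupFormationSketch {

    static double sc(int a, int b, int c) { return (double) a / (a + b + c); } // Definition 4 (Jaccard)
    static double rc(int a, int b)        { return (double) a / (a + b); }     // Definition 5

    // Binary value of row r given the current column order (the leftmost column has the highest weight).
    static long rowValue(int[][] slm, int r, List<Integer> cols) {
        long v = 0;
        for (int col : cols) v = (v << 1) | slm[r][col];
        return v;
    }

    // Binary value of column c given the current row order.
    static long colValue(int[][] slm, int c, List<Integer> rows) {
        long v = 0;
        for (int row : rows) v = (v << 1) | slm[row][c];
        return v;
    }

    // ROC: alternately reorder rows and columns by decreasing binary value until a step changes nothing.
    static void roc(int[][] slm, List<Integer> rows, List<Integer> cols) {
        while (true) {
            List<Integer> newRows = new ArrayList<>(rows);
            newRows.sort((x, y) -> Long.compare(rowValue(slm, y, cols), rowValue(slm, x, cols)));
            if (newRows.equals(rows)) return;       // Step 1: stop if the row order is unchanged
            rows.clear(); rows.addAll(newRows);

            List<Integer> newCols = new ArrayList<>(cols);
            newCols.sort((x, y) -> Long.compare(colValue(slm, y, rows), colValue(slm, x, rows)));
            if (newCols.equals(cols)) return;       // Step 2: stop if the column order is unchanged
            cols.clear(); cols.addAll(newCols);
        }
    }

    public static void main(String[] args) {
        // Small made-up SLM: 4 strategies (rows) x 6 learners (columns).
        int[][] slm = {
            {1, 0, 1, 0, 0, 1},
            {0, 1, 0, 0, 1, 0},
            {1, 0, 1, 0, 0, 1},
            {0, 0, 0, 1, 0, 0}
        };
        List<Integer> rows = new ArrayList<>(List.of(0, 1, 2, 3));
        List<Integer> cols = new ArrayList<>(List.of(0, 1, 2, 3, 4, 5));
        roc(slm, rows, cols);
        System.out.println("row order: " + rows + ", column order: " + cols);
        System.out.println(sc(2, 0, 1) + " " + rc(2, 0));   // coefficients for a = 2, b = 0, c = 1
    }
}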
4
An Illustrative Example
We illustrate the approach presented in the previous section using data from a classroom session in which 18 students used eXpresser to solve the ‘stepping stones’ task. Out of the 18 learners, 6 used the so-called ‘C’ strategy (Fig. 2(b)) (C), 4 used the ‘HParallel’ strategy (Fig. 2(c)) (H), 2 used the ‘VParallel’ strategy (Fig. 2(d)) (V), 1 used the ‘Squares’ strategy (Fig. 2(d)) (S), 1 used a combination of the ‘HParallel’ and ‘VParallel’ strategies (H&V) [17], and the remaining 4 students were either off-task or used non-systematic approaches, such as building the construction using individual tiles - see Table 2.

Table 2. Distribution of strategies used by learners

Strategies             C   H   V   S   H&V   Other
Number of Learners     6   4   2   1    1      4
A subset of the vectors for strategies and learners is displayed in Table 3. For learners that used the same strategy, only one example is provided; for example, learners λ1 to λ6 have the same vectors and thus only learner λ1 is displayed. The learners that did not follow a systematic approach are excluded: as shown in Table 2, 4 learners used non-systematic approaches to solve the task, denoted by ‘Other’.

Table 3. Strategies and learners vectors

Strategies   C   H   V   S
    C        1   0   0   1
    H        0   1   0   0
    V        0   0   1   0
    S        1   0   0   1

Learners     λ1   λ7   λ11   λ13   λ14
    C         1    0    0     0     0
    H         0    1    0     0     1
    V         0    0    1     0     1
    S         0    0    0     1     0
Table 4. Values of Relevance Coefficients (RC) and Similarity Coefficients (SC)

Strategies        λ1   λ2   λ3   λ4   λ5   λ6   λ7   λ8   λ9   λ10  λ11  λ12  λ13  λ14
C          RC     1    1    1    1    1    1    0    0    0    0    0    0    1    0
           SC     0.5  0.5  0.5  0.5  0.5  0.5  0    0    0    0    0    0    0.5  0
HParallel  RC     0    0    0    0    0    0    1    1    1    1    0    0    0    0.5
           SC     0    0    0    0    0    0    1    1    1    1    0    0    0    0.5
VParallel  RC     0    0    0    0    0    0    0    0    0    0    1    1    0    0.5
           SC     0    0    0    0    0    0    0    0    0    0    1    1    0    0.5
Squares    RC     1    1    1    1    1    1    0    0    0    0    0    0    1    0
           SC     0.5  0.5  0.5  0.5  0.5  0.5  0    0    0    0    0    0    0.5  0
Table 5. Initial matrix and final matrix (after ROC)
Learners using non-systematic approaches could also be represented by a distinctive vector, which would result in a cluster formed by these learners; however, they are already classified as a distinctive group and, therefore, including them in the grouping mechanism would only lead to unnecessary computations. Table 4 displays the values of RC and SC for each strategy and learner. Using θRC = 0.5 and θSC = 0.5, the initial matrix in Table 5 is obtained; applying ROC to it leads to the final matrix in Table 5 and to the following groups: (1) Group 1 includes learners λi, i = 1, 2, ..., 6 and λ13, who adopted the ‘C’ and ‘Squares’ strategies; (2) Group 2 includes learners λi, i = 7, 8, ..., 10 and λ14, who adopted the ‘HParallel’ strategy; (3) Group 3 includes learners λi, i = 11, 12, who adopted the ‘VParallel’ strategy. The advantage of using this method, as opposed to clustering based only on the strategies used, is that the similarities between different strategies can be modified by the teacher. They can vary from being very strict (a strategy is similar only to itself) to being very relaxed (a strategy is similar to other strategies when there is at least one part that is similar). For example, a teacher may consider the ‘Squares’ strategy to be similar to the ‘VParallel’ rather than the ‘C’ strategy. Consequently, the strategy vectors would be: (a) ‘C’ strategy (1 0 0 0); (b) ‘HParallel’ strategy (0 1 0 0); (c) ‘VParallel’ strategy (0 0 1 1); (d) ‘Squares’ strategy (0 0 1 1). Using these vectors, a new SLM matrix is obtained and the clustering procedure outputs the following groups:
(1) Group 1 includes learners λi, i = 1, 2, ..., 6, who used the ‘C’ strategy; (2) Group 2 includes learners λi, i = 7, 8, ..., 10 and λ14, who used the ‘HParallel’ strategy; (3) Group 3 includes learners λi, i = 11, 12 and λ14, who used the ‘VParallel’ and ‘Squares’ strategies. The mechanism we developed provides teachers with groups based on the strategies followed by learners, i.e. the clusters formed as explained above. Using this information, teachers decide the size of the groups and how the learners are distributed. Currently, our approach does not include social, cultural or personality factors, which are handled by the teacher. Future work, however, will look at integrating these factors and at automating the group formation in a flexible manner that, for example, will allow teachers to enter constraints such as ‘learner X should never be grouped with learner Y’.
5
Conclusions and Future Work
In this paper we presented an approach for group formation inspired by Group Technology. The criteria used in the grouping process are the strategies followed by learners in solving a task and the similarities between the different strategies of a particular task. Resemblance coefficients are used to define (a) the similarity between learners and strategies and (b) the relevance of each strategy for a particular learner. The approach outputs homogeneous groups; however, heterogeneous groups can be formed by choosing one or more learners from each or some of the homogeneous groups. The model could be refined to better reflect the similarity of the learner’s construction to the strategies in the Task Model. At the moment, when a learner uses more than one strategy, these are all contained in the learner’s vector and are represented by a value of 1. In addition, no special weights are used, as all strategies are considered equally important. However, the learner model includes information about the similarity of the learner’s construction with each strategy, and this could be proportionally reflected in the learner’s vector. Also, the similarities between strategies could be refined to reflect partial similarity, rather than having a value of ‘1’ for a range of similarities going from similarity of only one part to full similarity. This would possibly require a modification of the resemblance coefficients and is part of our future work.
References
1. Lynch, C., Ashley, K., Aleven, V., Pinkwart, N.: Defining ill-defined domains; a literature survey. In: Proc. of the Workshop on Intelligent Tutoring Systems for Ill-Defined Domains at the Intelligent Tutoring Systems Conference, pp. 1–10 (2006)
2. Brown, A., Palincsar, A.: Knowing, learning, and instruction. In: Guided, Cooperative Learning and Individual Knowledge Acquisition, pp. 307–336. Lawrence Erlbaum Associates, Hillsdale (1989)
3. Slavin, R.E.: When and Why Does Cooperative Learning Increase Achievement? In: The RoutledgeFalmer Reader in Psychology of Education. RoutledgeFalmer Readers in Education, vol. 1, pp. 271–293 (2003)
4. Daradoumis, T., Guitert, M., Giménez, F., Marquès, J.M., Lloret, T.: Supporting the composition of effective virtual groups for collaborative learning. In: Proc. of ICCE 2002, pp. 332–336. IEEE Computer Society, Los Alamitos (2002)
5. Gogoulou, A., Gouli, E., Boas, G., Liakou, E., Grigoriadou, M.: Forming homogeneous, heterogeneous and mixed groups of learners. In: Proc. of PING Workshop, pp. 33–40 (2007)
6. Graf, S., Bekele, R.: Forming heterogeneous groups for intelligent collaborative learning systems with ant colony optimization. In: Ikeda, M., Ashley, K.D., Chan, T.-W. (eds.) ITS 2006. LNCS, vol. 4053, pp. 217–226. Springer, Heidelberg (2006)
7. Christodoulopoulos, C.E., Papanikolaou, K.: Investigation of group formation using low complexity algorithms. In: Proc. of PING Workshop, pp. 57–60 (2007)
8. Ounnas, A., Davis, H.C., Millard, D.E.: A framework for semantic group formation in education. Educ. Tech. Soc. 12(4), 43–55 (2009)
9. Isotani, S., Inaba, A., Ikeda, M., Mizoguchi, R.: An ontology engineering approach to the realization of theory-driven group formation. I. J. Computer-Supported Collaborative Learning 4(4), 445–478 (2009)
10. Selim, H.M., Askin, R.G., Vakharia, A.J.: Cell formation in group technology: Review, evaluation and directions for future research. Comput. Ind. Eng. 34(1), 3–20 (1998)
11. Pollalis, Y.A., Mavrommatis, G.: Using similarity measures for collaborating groups formation: A model for distance learning environments. Eur. J. Oper. Res. 193(2), 626–636 (2009)
12. de los Angeles Constantino-Gonzalez, M., Suthers, D.D., de los Santos, J.G.E.: Coaching web-based collaborative learning based on problem solution differences and participation. Int. J. Artif. Intell. Ed. 13(2-4), 263–299 (2003)
13. Pearce, D., Mavrikis, M., Geraniou, E., Gutierrez, S.: Issues in the design of an environment to support the learning of mathematical generalisation. In: Dillenbourg, P., Specht, M. (eds.) EC-TEL 2008. LNCS, vol. 5192, pp. 326–337. Springer, Heidelberg (2008)
14. Ainsworth, S.: The functions of multiple representations. Comput. Educ. 33(2-3), 131–152 (1999)
15. Joines, J.A., King, R.E., Culbreth, C.T.: A comprehensive review of production-oriented manufacturing cell formation techniques. Int. J. Flex. Autom. Integr. Manuf. 3(3-4), 225–265 (1996)
16. King, J., Nakornchai, V.: Machine-component group formation in group technology: Review and extension. Int. J. Product. Res. 20(2), 117–133 (1982)
17. Cocea, M., Magoulas, G.: Task-oriented modeling of learner behaviour in exploratory learning for mathematical generalisation. In: Proc. of the 2nd ISEE workshop, pp. 16–24 (2009)
18. McAuley, J.: Machine grouping for efficient production. Production Engineer 51(2), 53–57 (1972)
19. Yin, Y., Yasuda, K.: Similarity coefficient methods applied to the cell formation problem: a comparative investigation. Comput. Ind. Eng. 48(3), 471–489 (2005)
Applying Pedagogical Analyses to Create an On-Line Course for e-Learning

D.-L. Le¹, V.-H. Tran¹, D.-T. Nguyen², A.-T. Nguyen², and A. Hunger³

¹ HCMc Uni of Pedagogy, Vietnam
² HCMc Uni of Science, Vietnam
³ Uni of Duisburg-Essen, Germany
{longld,haotv}@hcmup.edu.vn, {ndthuc,nate}@hcmuns.edu.vn, [email protected]
Abstract. In the teaching field, the building of programs, curricula, courses and lectures is considered a key phase in the development of learning content and materials. Therefore, producing learning content that satisfies the principles of completeness, logicality and pedagogical quality is necessary, and it is quite a challenging task for the teaching and learning process in both traditional learning and e-Learning environments. In this paper, we propose an approach to creating an on-line course, called e-Course, which is able to meet the requirements above. The proposed e-Course is based on pedagogical analyses and basic teaching principles. Its application focuses on the two main actors of an e-Learning system: the instructor designs and builds the e-Course, whose result is a set of interrelated topics or lessons, and the learner uses the e-Course in self-study activities, on-line or off-line. Keywords: e-Course, Knowledge Graph (KG), Sub-Knowledge Graph (SubKG), Prime Idea (PI), e-Learning System.
1 Introduction

Up to now, learning activities in the classroom (the traditional form of learning) have mainly been based on conventional printed materials (e.g. reference books, schoolbooks, textbooks) and the direct guidance of the instructor in charge. A reference book or schoolbook (abbr. book) covers part of the defined content knowledge of the curriculum that needs to be transferred to the learner to reach the teaching goal and learning outcome, and it is edited carefully by educational experts on paper. A textbook is a formal instruction manual prepared for use in schools/colleges and edited by the instructor (from books and related documentation) to support the self-paced activities of the learner (e.g. previewing material, doing exercises, reviewing knowledge). Along with the learning content in the textbook, the instructor makes use of his pedagogical ability and his¹ experience to extend or clarify learning concepts in time (by explaining, presenting, or illustrating with visual examples) when he communicates with his learners face-to-face in the teaching process; naturally, these concepts are not easy to print or present explicitly in a textbook.

¹ The feminine form is used in this paper for learners and the masculine form for instructors.
This shows that the role of the instructor in traditional learning is important, and learning activities are only effective when the instructor and his learners interact directly and reciprocally. Printed material, in general, presents knowledge only “roughly”, in both its content and its display. It also provides knowledge homogeneously to all learners, without distinguishing the background or cognitive ability of each learner. In recent decades, computers and mobile devices have been widely used to support learning activities, and especially a new representation of learning content that we consider here: e-books (i.e. electronic books). An e-book is an electronic text in the form of digital media, integrated with images, audio, video and so on [3] [12]. It is usually read on PCs, smart phones, or e-book readers. On the learner’s side, using an e-book reader is not as convenient and simple as holding a book to read, and it also has some screen limitations (e.g. size, resolution and brightness). All of these can easily make the learner feel tired when reading an e-book for a long time. Similar to a book/textbook, an e-book also misses the necessary interactions between instructor and learner that exist in traditional learning. Nowadays, distance training via the Internet, or on-line education (a.k.a. e-Learning), is being developed in many countries all over the world [1]. However, face-to-face communication between instructor and learner is limited in this learning form, and so the exploitation of learning materials in the former conventional way (in both content and display) becomes less and less effective. Moreover, developers of Web applications always say “Content is king” to emphasize the importance of good content [2], and building the content of Web-based courses (i.e. on-line courses) can be considered a prerequisite problem of e-Learning. In early Learning Management Systems (abbr. LMS) [13], the main learning content of an on-line course is often unstructured lectures simply in the format of electronic text, hypertext or hypermedia. The learning content of such an on-line course is not different from an e-book, even though it is divided into pages and branched for ordered presentation [5] [11] [14]. Briefly, the building of programs, curricula, courses and lectures is considered a key phase in the development of learning content and materials in both the traditional learning form and e-Learning. In particular, strengthening and improving learning content (e.g. unitary learning elements/objects) in an e-Learning system can help compensate for the lack of face-to-face communication between the instructor and learners. Clearly, if learning content can take over the teaching role of the instructor in the classroom, then learning activities in the e-Learning system will be almost similar to those in the traditional learning form. In this paper, we focus on the development of learning content for an on-line course and present a process to build an e-Course. The e-Course is a concept proposed on the basis of pedagogical analyses, so that its learning content can satisfy requirements such as completeness, logicality and pedagogical quality. The application focuses on the two main actors of an e-Learning system: the instructor designs and builds the e-Course, whose result is a set of interrelated lessons (called e-Lessons or topics), and the learner uses the e-Course in self-study activities on-line (through an LMS) or off-line (with PDF or CHM files). The rest of this paper covers four sections.
In Section 2, we present the pedagogical and teaching analyses that lead to the proposed concept of e-Course; Section 3 gives the definitions and related procedures for creating an e-Course; Section 4 describes the building of an e-Course and its application in e-Learning; and the conclusion is presented in Section 5.
2 Pedagogical Analyses and the Proposed Concept of e-Course

In the traditional teaching process, the instructor in charge prepares the materials and related resources (from given books/textbooks) for his course based on a detailed syllabus, and usually divides the course into different topics or lessons in order to transmit the required knowledge to learners in a fixed duration. The instructor also establishes standards and learning goals for the topics; he then reorganizes the knowledge from the books/textbooks into teaching knowledge (for each topic or lesson) that can be taught directly in the classroom. Fig. 1 describes a course with its lessons and topics, together with the common information (e.g. introduction, goals, grade level) and related components (e.g. summary, exercise, keywords, content).
Fig. 1. A specific course and its main components
When a topic is presented, it aims at objectives: learners should acquire the topic content, memorize the required knowledge in the topic, and be able to apply that knowledge in real-life contexts (e.g. doing exercises, carrying out projects). These requirements seem easy to meet, but in the practical learning process it is not so simple. A learner can memorize the standard knowledge but not understand the topic or not be able to apply the knowledge in real life; a learner can acquire the topic but not know which knowledge must be memorized, and so on. Consequently, the role of the instructor, with his pedagogical ability and experience in face-to-face teaching, is important and necessary. Basically, we see that a topic can be represented in two main parts: the prime ideas of the topic and the instructor’s presentations. A prime idea is necessary core knowledge of the topic that the learner is required to understand and memorize; however, she can meet some difficulties in learning it during her self-study activities if there is no guidance from the instructor. The instructor’s presentation is an act of display (a.k.a. the “art of teaching”) that makes the core knowledge above clear, based on the instructor’s pedagogical ability and experience (through explanation, illustration, demonstration). Therefore, the instructor himself often designs the topic and uses it during the face-to-face communication between him and the learners in the classroom, to help learners acquire the knowledge easily.
For e-Learning systems, producing a unitary learning object (abbr. LO) such as a topic (or lesson) of an on-line course that satisfies all the requirements above is more difficult, because it lacks the face-to-face communication with the instructor that always exists in the traditional classroom. Besides, as we know, learning is hard work, and it becomes more and more boring and loses its necessary attraction when the learner has to face the screen of an IT device such as a PC and does not communicate directly with other participants. In addition, there is a gap between reusability and context in the structure of the conventional learning object [4]. With the increasing use of instructional designers as pedagogical experts for e-Learning systems, it is important to learn as much as possible about designing and developing effective instructional design across on-line courses. When instructional designers are employed as pedagogical experts but not content experts, and the instructors are content experts but not pedagogical experts, the result is a bifurcation of content and pedagogy [7]. The proposed concept of e-Course is the combination of the core content knowledge (through a set of prime ideas with given objectives) and the external presentation of the instructor, based on his pedagogical ability and experience, to transmit the knowledge to the learner. The e-Course is thus able to give the learner a better chance to acquire the required knowledge completely and logically (i.e. following basic pedagogical principles for teaching knowledge), especially in self-study activities via PC and the Internet [9]. The related concepts are shown in Table 1.

Table 1. General concept of e-Course and topic

Concept     Descriptions
e-Course    A course based on computer technologies, equivalent to a unit of curriculum or an academic subject in the traditional form of learning, taught and learned completely via the Internet with the support of multimedia and system resources.
Topic       A part of a course; an on-line lecture determining a body of specific knowledge (based on the knowledge standards of the curriculum) with a given objective. A concrete learning object, known as a unit of knowledge, which can be part of an e-Lesson containing the learning content that has to be taught to learners; an e-Lesson usually has many topics. A topic is designed, in optional forms of representation, by the instructor based on core knowledge (prime ideas). It conveys the complete meaning of the learning knowledge that the learner needs to study by herself without the instructor’s direct guidance. A topic’s content can integrate multimedia types.
3 Definitions and Related Procedures for Creating an e-Course
This section presents the design of an e-Course in mathematical form, with definitions, propositions and procedures. Section 3.1 presents the KG and its definitions; the KG is the fundamental model of content knowledge used to create an e-Course. Section 3.2 shows the
procedure to generate a Sub-KG based on a goal derived from the KG; and Section 3.3 gives the related definitions of the e-Course and shows that it is a pedagogical instance of the Sub-KG.
3.1 KG Model and Fundamental Definitions
A Knowledge Graph (abbr. KG) is a set of core content knowledge (prime ideas) together with its internal relationships; the KG is also the knowledge representation of the learning content in e-Learning systems [8].
Definition 1. Prime idea
A prime idea, also called PI for short, is the smallest knowledge unit about a specific technical topic and is explained in a short, transparent paragraph. We use the notation ρ, which is kept throughout the paper, to express a PI. Fig. 2 illustrates the generic structure of two particular PIs of the course "Fundamental Programming with C" and some of their components, such as ID, label, statement and category.
Fig. 2. The illustration of PI and its components
Definition 2. Relationship between two PIs
Given two PIs ρj and ρk, exactly one of the following relationships holds:
(1) One PI is a hard-condition of the other; for instance, ρj is a hard-condition of ρk, denoted ρj →h ρk.
(2) One PI is a necessary-condition of the other; for instance, ρj is a necessary-condition of ρk, denoted ρj → ρk.
(3) They are independent, if there is no necessary-condition relationship between them.
Fig. 3 illustrates the relationships between PIs: ρ4 is a hard-condition of ρ5; ρ1 or ρ2 is a necessary-condition of ρ5; ρ3 and ρ5 are independent of each other.
Fig. 3. The illustration of PI’s relationship
Definition 3. Knowledge Graph (KG)
Given a finite set of PIs {ρi}, i = 1, ..., n, let V = {ρi} be the vertex set and E = {(ρj, ρk)}, j ≠ k, where (ρj, ρk) means ρj →h ρk, be the edge set. Then KG = ⟨V, E⟩. A KG has the following properties:
(1) KG is a directed and acyclic graph (a.k.a. a dag).
(2) KG is minimal, i.e. ∄ ρj, ρk: ρj →h ρk and ρj → ρk. Such a KG is called a consistent KG.
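As an illustration of Definitions 1-3 (not the authors' implementation), the following minimal Python sketch stores PIs as vertices and hard-condition relations as directed edges, and checks the acyclicity property of Definition 3; the PI labels and edges are hypothetical.

```python
# Minimal sketch of a Knowledge Graph (KG) of prime ideas (PIs); only the
# hard-condition edges of Definition 3 are stored. PI labels are hypothetical.
from collections import defaultdict

class KnowledgeGraph:
    def __init__(self):
        self.vertices = set()          # V: the set of prime ideas
        self.hard = defaultdict(set)   # E: rho_j ->h rho_k (hard-condition edges)

    def add_pi(self, pi):
        self.vertices.add(pi)

    def add_hard_condition(self, pi_j, pi_k):
        """Record that pi_j is a hard-condition of pi_k."""
        self.add_pi(pi_j)
        self.add_pi(pi_k)
        self.hard[pi_j].add(pi_k)

    def is_acyclic(self):
        """Property (1) of Definition 3: the KG must be a dag."""
        visiting, done = set(), set()

        def visit(v):
            if v in done:
                return True
            if v in visiting:
                return False        # back edge found: a cycle exists
            visiting.add(v)
            ok = all(visit(w) for w in self.hard[v])
            visiting.discard(v)
            done.add(v)
            return ok

        return all(visit(v) for v in self.vertices)

# Hypothetical hard-condition edges (not those of Fig. 3)
kg = KnowledgeGraph()
kg.add_hard_condition("rho1", "rho3")
kg.add_hard_condition("rho3", "rho5")
kg.add_pi("rho2")
print(kg.is_acyclic())  # True for this example
```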
Definition 4. Goal
A goal is expressed by a set of started-PIs and a set of ended-PIs. The set of started-PIs, denoted by I = {ρ}n, is the set of knowledge to start from (a.k.a. pre-conditions), and the set of ended-PIs, denoted by O = {ρ}m, is the set of knowledge to end with (a.k.a. learning outcomes). I and O are chosen (by the user) from the set of prime ideas of the KG depending on:
− the curriculum and syllabus of the instructor; or
− learning materials such as books/textbooks; or
− the requirements of input and output in the teaching context of each instructor or the learning context of each learner/learning group.
Definition 5. Goal-based Sub-KG
Given training and/or teaching goals, a Sub-KG based on a goal (i.e. a goal-based Sub-KG) is a subgraph derived from the KG with the set of started-PIs I and the set of ended-PIs O. In short, a goal-based Sub-KG is simply called a Sub-KG throughout the paper.
Proposition 6. A Sub-KG has the properties of completeness and logicality, where:
1. Completeness: given J, a set of prime ideas, if every ρj ∈ I is a necessary-condition, then O ⊂ J.
2. Logicality:
i. every ρj ∈ I must be a necessary-condition of at least one ρk ∈ O,
ii. every ρj ∈ O must have at least one ρk ∈ I as a necessary-condition, and
iii. if ρj is a necessary-condition of ρk, then ρj must be shown before ρk.
3.2 Generating a Sub-KG Based on a Goal
With the given training and/or teaching goals, the Sub-KG based on the goal is a graph derived from the KG with the started-PIs I and the ended-PIs O. Given KG = ⟨V, E⟩, the Sub-KG is generated from the KG based on the goal, with I = {ρ}n the set of started-PIs and O = {ρ}m the set of ended-PIs (i.e. both I and O are subsets of V). See Fig. 4 for details.
3.3 e-Course: A Pedagogical Instance of the Sub-KG
Definition 7. Topic
A topic is the combination of a triple consisting of keywords K, prime ideas ρ and displayed content Co. Letting t be a topic, we have t = ⟨K, ρ, Co⟩.
The nature of a topic is its prime ideas, which are displayed by the instructor through authoring and delivery tools (e.g. AeGS, a proposed tool that generates e-Lessons from a Sub-KG) and presented to learners.
Definition 8. e-Course
An e-Course is a set of topics with an ordered sequence defined by the instructional designer. We note two important properties of an e-Course:
- an e-Course is formed from all of the prime ideas of a goal-based Sub-KG;
- an e-Course is designed and displayed using ICT skills and tools to fully meet the given requirements of an on-line course.
Fig. 4. Procedure of generating Sub-KG
Proposition 9. An e-Course is an on-line course that satisfies the principles of completeness, logicality, and pedagogy. Indeed:
- Completeness and logicality. As the goal-based Sub-KG is a subgraph of the KG, it satisfies these two qualities, and so does the e-Course.
- Pedagogy. The e-Course is designed according to the "learning script" (i.e. lesson plan) of each individual instructor, and therefore it carries the instructor's pedagogical ability and experience into the content knowledge.
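To make the Section 3.2 procedure (Fig. 4) concrete, the sketch below derives a goal-based Sub-KG by keeping only the PIs that lie on a path from a started-PI in I to an ended-PI in O. This is our reading of the procedure, not the AeGS code; the graph encoding and the example PIs are assumptions.

```python
# Hedged sketch: derive a goal-based Sub-KG from a KG given I (started-PIs)
# and O (ended-PIs). 'edges' maps each PI to the PIs it is a condition of.
def sub_kg(edges, I, O):
    def reachable(starts, graph):
        seen, stack = set(starts), list(starts)
        while stack:
            v = stack.pop()
            for w in graph.get(v, ()):
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    forward = reachable(I, edges)                  # PIs reachable from I
    reverse = {}
    for v, ws in edges.items():
        for w in ws:
            reverse.setdefault(w, set()).add(v)
    backward = reachable(O, reverse)               # PIs from which O is reachable
    kept = forward & backward                      # PIs on some I -> O path
    sub_edges = {v: {w for w in edges.get(v, ()) if w in kept} for v in kept}
    return kept, sub_edges

# Hypothetical KG: rho1 -> rho3 -> rho5, rho2 -> rho4 -> rho5, rho6 isolated
kg_edges = {"rho1": {"rho3"}, "rho2": {"rho4"},
            "rho3": {"rho5"}, "rho4": {"rho5"}, "rho6": set()}
vertices, edges = sub_kg(kg_edges, I={"rho1"}, O={"rho5"})
print(sorted(vertices))   # ['rho1', 'rho3', 'rho5']
```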
4 Building an e-Course and Its Application in e-Learning
To build an e-Course, the procedure has two phases (see Fig. 5):
− Phase 1: generating the logical course (Sub-KG), as presented in Section 3.2.
− Phase 2: creating the interface component of the e-Course, namely building the topics of the e-Course.
Fig. 5. Building topic from Sub-KG and e-Course is collection of topics
Note that a topic includes the following.
- The core knowledge of the topic (with an objective) consists of the PIs that need to be presented to the learner; a topic can be structured from many PIs. The external representation of the topic (through the user interface of the system) is the topic content that explains the PIs within the topic; it depends on the design of each instructor.
- A topic can take various forms: concept/principle or process/operation; theory or exercise; easy or difficult; simple or complex. Accordingly, the topic content is edited and presented to suit a particular type of presentation (e.g. question, explanation, step-by-step guidance, image, and diagram). Different instructors can design a topic in many different forms, but all of them must be built from the same Sub-KG of the course; the qualities of completeness and logicality are therefore preserved in each topic.
- A topic currently being studied is linked to other topics (known knowledge) through the labels of its PIs.
During the development of an e-Course, the instructional designer uses authoring and delivery tools to produce topics according to standard metadata (e.g. IEEE-LOM). In the current teaching and learning contexts of undergraduate education in developing countries (such as Vietnam [15]), there are many disadvantages, such as ineffective teaching
methods, inadequate resources, lack of common or professional skills, and weak network capacity and infrastructure [6] [16]. Hence, the blended-learning form is a good way to support instructors and learners in the learning process, in which an LMS with a Web-based course is an important computer-mediated technology and infrastructure. It consists of two parts: the e-Course and e-Learning activities [10]. When a learner joins the system, she can actively interact with her instructor, with the content knowledge, or with other learners in different activities: a learner can recognize incomprehensible knowledge of the course (through PIs) and give feedback or ask for help; the instructor can recommend a specific learning path for each learner by supervising her learning process; and a group of learners can discuss difficult knowledge of the course (through PIs) with the support of the instructor. Besides, based on the given Sub-KG, the instructor can design his presentations with multimedia, lecture notes and the textbook of the specific course to provide them to learners off-line, or he can build different e-Courses to adapt to the individual characteristics of each group of learners. For instance (see Fig. 6), a class may have three groups of learners, in which group 1's feature is {habit = ">2h"; style = "active"; background = "good"}, group 2's feature is {habit = "<1h"; style = "passive"; background = "inadequate"}, and group 3 is the rest.
Fig. 6. Applying e-Course in the teaching-learning activities of two main actors
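The grouping example above can be read as a simple profile-matching step. The sketch below picks an e-Course variant for each group; the selection rules and variant names are invented for illustration, and only the group profiles come from the text.

```python
# Illustrative only: choose an e-Course variant from a group profile.
# The profiles mirror the example in the text; the selection rules are invented.
def choose_variant(profile):
    if profile.get("style") == "active" and profile.get("background") == "good":
        return "compact e-Course (fewer explanations, more exercises)"
    if profile.get("style") == "passive" or profile.get("background") == "inadequate":
        return "extended e-Course (step-by-step guidance, more illustrations)"
    return "standard e-Course"

groups = {
    "group 1": {"habit": ">2h", "style": "active", "background": "good"},
    "group 2": {"habit": "<1h", "style": "passive", "background": "inadequate"},
    "group 3": {},  # the rest: no special profile recorded
}
for name, profile in groups.items():
    print(name, "->", choose_variant(profile))
```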
5 Conclusion
This work aims at building an on-line course that is fully meaningful and self-contained for use in a blended-learning environment, and that brings the pedagogical and teaching methods of the instructor into the learning content to support learners in self-study activities via PC and the Internet.
In future research, we will continue to develop a course-generation tool (e.g. AeGS) and an adaptive e-Learning framework based on the proposed e-Course [10] to improve the limitations of interaction among the three factors in the teaching and learning process: instructor, learner and content knowledge. The system will then play the role of an adaptive instructional system that supports and recommends content knowledge suitable for each learner or group of learners.
Acknowledgments. This work has been partly funded by Vietnam National University - Hochiminh City, Vietnam under the User Modeling for e-Learning (UMeL) project, code B2009-18-01TD.
References 1. Baron, A.: A teacher’s guide to distance learning, http://fcit.usf.edu/DISTANCE/ 2. Callan, D.: Content is king, http://www.akamarketing.com/content-is-king.html 3. Doctorow, C.: Ebooks: Neither E, Nor Books, O’Reilly Emerging Technologies Conference, http://craphound.com/ebooksneitherenorbooks.txt 4. Hodgins, W.: Get REAL (Relevant Effective Adaptive Learning), TECHLEARN (2001) 5. Horton, W.: E-Learning by design. Pfeiffer-An Imprint of Wiley, USA (2006) 6. Jones, C., Holmfeld, L.-D., Lindström, B.: A relational, indirect, meso-level approach to CSCL design in the next decade. International Journal Computer Supported Collaborative Learning (ijCSCL) 1(1) (2006) 7. Kenuka, H.: Instructional Design and e-Learning: A Discussion of Pedagogical Content Knowledge as a Missing Contruct. e-Journal of Instructional Science and Technology (eJIST) 9(2) (2006) 8. Le, D.-L.: Toward a supporting system for e-Learning environment. In: Addendum Contributions to the 2008 IEEE International Conference on Research, Innovation and Vision for the Future (RIVF 2008), Vietnam, pp. 200–203 (2008) 9. Le, D.-L., Tran, V.-H., Hunger, A., Nguyen, D.-T.: e-Course and its Applications in Blended-Learning Environment. In: EEE 2008 - WORLDCOMP, USA (2008) 10. Le, D.-L., Nguyen, A.-T., Nguyen, D.-T., Hunger, A.: Building learner profile in Adaptive e-Learning Systems. In: The 4th International Conference on e-Learning (ICEL), Canada, pp. 294–304 (2009) 11. Open-source LMS Moodle, page Documentations, http://docs.moodle.org/en/Main_Page 12. Pastore, M.: 30 Benefits of Ebooks, http://epublishersweekly. blogspot.com/2008/02/30-benefits-of-ebooks.html 13. Schroeder, U., Spanagel, C.: Supporting the Active Learning Process. International JI on E-learning 5(2), 245–264 (2006) 14. SCORM Content Aggregation Model Ver. 1.3.1, http://www.adlnet.org/downloads/70.cfm 15. Stephen, W., et al.: Observations on undergraduate education in computer science, electrical engineering, and physics at select universities in Vietnam. In: A Report Presented to the Vietnam Education Foundation by the Site Visit Teams of the National Academies of the United States (2006), http://home.vef.gov/download/Report_on_Undergrad_Educ_E.pdf 16. Tinio, V.L.: ICT in Education. In: United Nations Development Program (UNDP) – APDIP, pp. 3–15 (2003), http://www.apdip.net/elibrary
Adaptive Modelling of Users' Strategies in Exploratory Learning Using Case-Based Reasoning
Mihaela Cocea, Sergio Gutierrez-Santos, and George D. Magoulas
London Knowledge Lab, Birkbeck College, University of London, 23-29 Emerald Street, London, WC1N 3QS, UK
{mihaela,sergut,gmagoulas}@dcs.bbk.ac.uk
Abstract. In exploratory learning environments, learners can use different strategies to solve a problem. To the designer or teacher, however, not all these strategies are known in advance and, even if they were, introducing them in the knowledge base would involve considerable time and effort. In previous work, we have proposed a case-based knowledge representation, modelling the learners' behaviour when constructing/exploring models through simple cases and sequences of cases, called strategies. In this paper, we enhance this approach with adaptive mechanisms for expanding the knowledge base. These mechanisms make it possible to identify and store inefficient cases, i.e. cases that pose additional difficulty to students in their learning process, and to gradually enrich the knowledge base by detecting and adding new strategies. Keywords: user modelling, knowledge base adaptation, exploratory learning environments, case-based reasoning.
1 Introduction
Exploratory learning environments (ELEs) provide activities that involve constructing [1] and/or exploring models, varying their parameters and observing the effects of these variations on the models. When provided with guidance and support, ELEs have a positive impact on learning compared with other more structured environments [2]; however, the lack of support may actually hinder learning [3]. Therefore, to make ELEs more effective, intelligent support is needed, despite the difficulties arising from their open nature. To address this, we proposed a learner modelling mechanism for monitoring learners' actions when constructing/exploring models by modelling sequences of actions reflecting different strategies in solving a task [4]. An important problem, however, remains: only a limited number of strategies are known in advance and can be introduced by the designer/teacher. In addition, even if all strategies were known, introducing them in the knowledge base would take considerable time and effort. Moreover, the knowledge about a task evolves over time - students may discover different ways of approaching the task, rendering the knowledge base
suboptimal for generating proper feedback, despite the initially good coverage. To address this, we employ a mechanism for adapting the knowledge base in the context of eXpresser [5], an ELE for mathematical generalisation. The knowledge base adaptation involves a mechanism for acquiring inefficient cases, i.e. cases which include actions that make it difficult for students to create a generalisable model, and a mechanism for acquiring new strategies. The former could be potentially useful to enable targeted feedback about the inefficiency of certain parts of a construction, or certain actions of the student; this approach could also lead gradually to creating a library of inefficient constructions produced by students that could be analysed further by a researcher/teacher. Without the latter, a new valid strategy will not be recognised as such, and, consequently, the learner modelling module will diagnose the learner to be still far from a valid solution and any potential feedback will be confusing as it will guide the learner towards the most similar strategy stored in the knowledge base. The next section briefly introduces eXpresser and mathematical generalisation. Section 3 describes the case-based reasoning (CBR) cycle for eXpresser and gives a brief overview of the knowledge representation and the identification mechanism employed. Section 4 presents our proposed approach for adapting the knowledge base. Section 5 describes the validation of this approach and, finally, Section 6 concludes the paper and presents some directions for future work.
2 Mathematical Generalisation with eXpresser
eXpresser [5] is an ELE for the domain of mathematical generalisation. It is designed for classroom use and targets pupils aged 11 to 14. Each task involves constructing a model and deriving an algebraic-like rule from it. Fig. 1 illustrates the system, the properties list of a pattern (linked to another one) and an example of a rule. The left screenshot includes two windows: (a) the students' world, where students build their constructions, and (b) the general world, which displays the same construction with a different value for the variable(s) involved in the task, and where students can check the generality of their construction by animating their pattern (using the Play button). We illustrate here a task called 'stepping stones' (see Fig. 1) displayed in the students' world with a number of 3 red (lighter colour) tiles and in the general world with a number of 8 red tiles; the task requires building such a model and finding a general rule for the number of blue (darker colour) tiles needed to surround the red ones. The model for this task can be built in several ways that we call strategies. Here we illustrate the 'C strategy', named after the shape of the building-block, i.e. the basic unit of a pattern. Its components are displayed separately in the students' world for ease of visualisation: a red pattern, having 3 tiles, a blue one made of a C-shape pattern repeated 3 times, and 3 blue tiles. The property list of the C-shape pattern is displayed in the top right screenshot. The first property (A) specifies the number of iterations of the building-block; the value for this attribute is set to the value of the iterations of the red pattern by using a T-box (that includes a name and a value); by using a T-box,
Fig. 1. eXpresser screenshots
the two (or more) properties are made dependent, i.e. when the value in the T-box changes in one property, it also changes in the other one(s). The next properties are move-right (B), set to 2, and move-down (C), set to 0. The last property (D) establishes the number for colouring all the tiles in the pattern, in our case 5 times the iterations of the red pattern. The bottom right screenshot displays the rule for the number of blue tiles: 5 x red + 3, where red stands for the T-box in (A) (a T-box can be displayed with name only, value only or both). The construction and the bottom-right rule in Fig. 1 constitute one possible solution for the 'stepping stones' task. Although in its simplest form the rule is unique, there are several ways to build the construction and infer a rule from its components. Thus, there is no unique solution and students follow various strategies to construct their models. Two such examples are illustrated in Fig. 2.
Fig. 2. (a) ‘HParallel’ Strategy; (b) ‘VParallel’ Strategy
3 Modelling Learners' Strategies Using CBR
In case-based reasoning (CBR) [8] knowledge is stored as cases, typically including a problem and its solution. When a new problem is encountered, similar cases are retrieved and the solution is used/adapted from them. Four processes are involved [8]: (a) Retrieve similar cases to the current problem; (b) Reuse the cases (and adapt them) to solve the current problem; (c) Revise the proposed solution if necessary; (d) Retain the new solution as part of a new case.
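A bare-bones skeleton of this cycle is sketched below. It is a generic CBR loop, not the eXpresser module; the similarity, adapt and revise functions are placeholders supplied by the caller.

```python
# Generic CBR cycle skeleton (retrieve / reuse / revise / retain); the
# similarity, adaptation and revision functions are caller-supplied placeholders.
def cbr_cycle(problem, case_base, similarity, adapt, revise):
    best = max(case_base, key=lambda case: similarity(problem, case))   # retrieve
    proposed = adapt(best, problem)                                      # reuse
    solution = revise(proposed, problem)                                 # revise
    case_base.append({"problem": problem, "solution": solution})         # retain
    return solution
```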
In exploratory learning a problem has multiple solutions and it is important to identify the one the learner used or whether the learner produced a new valid solution. To this end, eXpresser has a case-base (knowledge base) of solutions (i.e. strategies) for each task. When a learner builds a construction, it is transformed into a sequence of simple cases (i.e. a strategy) and compared to all strategies in the case-base for the task the learner is working on; the case-base consists of strategies, i.e. composite cases, rather than simple cases. To retrieve the most similar strategies to the one used by the learner, appropriate similarity metrics are employed (see below). Once the most similar strategies are identified, they are used in a scaffolding mechanism that implements a form of reuse by taking this information into account along with other information, such as learner characteristics (e.g. knowledge level, spatial ability), completeness of solution and state within a task. The reuse, revise and retain steps are part of the knowledge base adaptation described in Section 4: simple cases are modified and then stored in a set of inefficient cases; new strategies are stored without modifications. We use the term knowledge base adaptation in the sense that the knowledge base changes over time to adapt to new ways in which learners approach tasks, ways that could be either efficient or inefficient. This is referred to as 'adaptation to a changing environment' [9]. It is not, however, the same as adaptation in the CBR sense (case adaptation), which involves the reuse and revise processes [10]. These processes, though, are included in the acquisition of inefficient cases, and thus, case adaptation is partly present in our mechanism. The acquisition of new strategies corresponds to case-base maintenance in CBR terminology [8], as it involves adding a new case for which no similar case was found. Knowledge Representation. Strategies for building constructions are represented as series of cases with relations between them. A case is defined as Ci = {Fi, RAi, RCi}, where Ci is the case, Fi is a set of attributes, RAi is a set of relations between attributes and RCi is a set of relations between Ci and other cases. The set of attributes of Ci is defined as Fi = {αi1, αi2, . . . , αiN}. It includes three types: numeric, variables and binary. Numeric attributes correspond to the values in the property list and variables correspond to the type of those properties: number, T-box, expression with number(s) or expression with T-box(es). Binary attributes refer to the membership of a case to a strategy, defined as a PartOfS function returning 1 if it belongs to the strategy and 0 otherwise; there are S binary attributes (S = number of strategies in the knowledge base). The set of relations between attributes of a case Ci and attributes of other cases (including Ci) is represented as RAi = {RAi1, RAi2, . . . , RAiM}, where at least one of the attributes in each relation RAim, ∀m = 1, M, is from Fi, the set of attributes of Ci. Two types are used: (a) dependency relations such as the one illustrated in Fig. 1, where the number of iterations of the blue pattern depends on the iterations of the red pattern through the use of a T-box; (b) value relations such as 'the value of the colouring property of the blue pattern in Fig. 1 is 5 times the iterations of the red pattern'. The set of relations between cases is represented as RCi = {RCi1, RCi2, . . . , RCiP}, where one of the cases in each relation RCij, ∀j = 1, P, is Ci. Two time-relations are used: (a) the Prev relation
indicates the previous case and (b) the Next relation indicates the next case, with respect to the current case. A strategy is defined as Su = {Nu(C), Nu(RA), Nu(RC)}, u = 1, r, where Nu(C) is a set of cases, Nu(RA) is a set of relations between attributes of cases and Nu(RC) is a set of relations between cases. The structure of a strategy is displayed in Fig. 3.
Fig. 3. Schematic structure of strategies
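A minimal data-structure sketch of this representation is given below. The field names follow the definitions above, but the concrete encoding of attributes and relations is an assumption rather than the eXpresser implementation.

```python
# Sketch of the case/strategy representation: Ci = {Fi, RAi, RCi}.
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Case:
    attributes: Dict[str, object] = field(default_factory=dict)               # Fi: numeric, variable, binary
    attr_relations: List[Tuple[str, str, str]] = field(default_factory=list)  # RAi: (kind, attribute, other attribute)
    case_relations: List[Tuple[str, "Case"]] = field(default_factory=list)    # RCi: ('Prev'/'Next', other case)

@dataclass
class Strategy:
    cases: List[Case] = field(default_factory=list)                           # Nu(C)
    attr_relations: List[Tuple[str, str, str]] = field(default_factory=list)  # Nu(RA)
    case_relations: List[Tuple[str, int, int]] = field(default_factory=list)  # Nu(RC), by case index

# Hypothetical case loosely mirroring the C-shape pattern of Fig. 1
c_shape = Case(
    attributes={"iterations": 3, "iterations_type": "T-box",
                "move_right": 2, "move_down": 0, "PartOfS_Cstrategy": 1},
    attr_relations=[("dependency", "iterations", "red.iterations"),
                    ("value", "colouring", "5 * red.iterations")],
)
```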
Similarity Metrics. Strategy identification is based on scoring elements of the strategy followed by the learner according to the similarity of their attributes and their relations to strategies previously stored. The similarity metrics used for cases and strategies are displayed in Fig. 4. The overall similarity metric for strategies is: Sim = w1 ∗ F1′ + w2 ∗ F2 + w3 ∗ F3 + w4 ∗ F4, where F1′ is the normalised value of F1. To bring F1 into the same range as the other metrics, i.e. [0, 1], we applied linear scaling to unit range [11] using the function F1′ = F1/z. Weights are applied to the four similarity metrics to express the central aspect of the construction, the structure. This is mostly reflected by the F1 metric and, to a lesser extent, by the F3 metric. Therefore, we agreed on the following weights: w1 = 6, w2 = 1, w3 = 2, w4 = 1. This leads to the range of [0, 10]
Fig. 4. Similarity metrics for cases and strategies
for the values of Sim. The metrics have been tested for several situations of pedagogical importance: identifying complete strategies, partial strategies, mixed strategies and non-symmetrical strategies. The similarity metrics were successful in identifying all these situations (details can be found in [4]).
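The weighted combination can be sketched as follows; the individual metrics F1-F4 and the scaling constant z are taken as given inputs, since their exact definitions are in Fig. 4 and [4], and the example values are invented.

```python
# Sketch of the overall strategy similarity Sim = w1*F1' + w2*F2 + w3*F3 + w4*F4,
# with F1 linearly scaled to [0, 1] by F1' = F1 / z (weights from the text).
WEIGHTS = (6.0, 1.0, 2.0, 1.0)

def overall_similarity(f1, f2, f3, f4, z):
    f1_scaled = f1 / z if z else 0.0                     # linear scaling to unit range
    parts = (f1_scaled, f2, f3, f4)
    return sum(w * f for w, f in zip(WEIGHTS, parts))    # value in [0, 10]

# Hypothetical metric values for an input strategy vs. a stored strategy
print(overall_similarity(f1=12.0, f2=0.8, f3=0.5, f4=1.0, z=15.0))  # 7.6
```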
4 Adaptation of the Knowledge Base
Our approach for adapting the knowledge-base of eXpresser includes acquiring inefficient simple cases, acquiring new strategies and deleting redundant ones. In this paper we focus on the first two, for which some examples from the ‘stepping stones’ task are displayed in Fig. 5; the constructions are broken down into individual components used by the students for ease of visualisation. These examples, with the adaptation rationale and mechanism are discussed below.
Fig. 5. (a)-(b) HParallel strategy with an inefficient component (blue middle row) and its property list; (c)-(d) VParallel strategy with an inefficient component (blue vertical bars) and its property list; (e) a new strategy
Acquiring inefficient simple cases. The goal of this mechanism is to identify parts of strategies constructed in inefficient ways and store them in a set of 'inefficient constructions', i.e. constructions that pose difficulties for learners in their process of generalisation. The set could be used for automatic generation of feedback or analysed by a researcher/teacher. The results could inform the design of better interventions, could be presented as a lesson learned to the scientific community of mathematics teachers/researchers, or even discussed in class (e.g. if an inefficient construction is often chosen by pupils of that class). The construction in Fig. 5a illustrates an inefficient case within the "HParallel" strategy of the 'stepping stones' task: the middle bar of blue tiles is constructed as a group of two tiles repeated twice - see its property list in Fig. 5b. The efficient way to construct this component is one tile repeated 4 times or, to make it general, one tile repeated the number of red tiles plus 1. Both approaches, i.e. the efficient and the inefficient, lead to the same visual output, i.e. there is no difference in the way the construction looks, making the situation even more confusing. The difficulty lies in relating the values of the blue middle row (Ci) to the ones of the red middle row (Cj). If the learner relates value 2 of iterations of Ci to value 3 of iterations of Cj, i.e. value 2 is obtained by using the iterations of red tiles (3) minus 1, this would work only for a 'stepping stones'
task defined for 3 red tiles. In other words, this will not lead to a general model. Another example of an inefficient case is given in Fig. 5c, with its property list presented in Fig. 5d. These inefficient cases are not related to mathematical misconceptions, but seem to be a consequence of the system's affordances combined with little experience of generalisation tasks, and come from learners' wish to make their patterns bigger without considering the generality of their approach. However, they are pedagogically important, offering learners the possibility to reflect on their actions in relation to generalisation.

Algorithm 1. Verification(StrategiesCaseBase, InputStrategy)
  Find the most similar strategy to InputStrategy from StrategiesCaseBase
  StoredStrategy ← most similar strategy
  if similarity > θ then
    Find cases of InputStrategy that are not an exact match to any case of StoredStrategy
    for each case that is not an exact match do
      InputCase ← the case that is not an exact match
      Compare InputCase to all cases of the set of inefficient cases
      if no exact match then
        Find the most similar case to InputCase from the cases of StoredStrategy
        StoredCase ← the most similar case
        if Conditions(StoredCase, InputCase) returns true then   // see Alg. 2
          InefficientCaseAcquisition(StoredCase, InputCase)      // see Alg. 3
        end if
      end if
    end for
  end if
Algorithm 2. Conditions(C1, C2)
  if (MoveRight[C1] = 0 and Iterations[C1] ∗ MoveRight[C1] = Iterations[C2] ∗ MoveRight[C2])
     or (MoveDown[C1] = 0 and Iterations[C1] ∗ MoveDown[C1] = Iterations[C2] ∗ MoveDown[C2]) then
    return true
  else
    return false
  end if
Algorithms 1, 2 and 3 illustrate how inefficient simple cases are identified and stored. First, the most similar strategy is found. If there is no exact match, but the similarity is above a certain threshold θ, the process continues with the identification of the inefficient cases; for each of these cases, several checks are performed (Alg. 2). Upon satisfactory results, and if the cases are not already in the set of inefficient cases, they are then stored (Alg. 3). What is stored is actually a modification of the most similar (efficient) case, in which only the numerical values of iterations, move-right and/or move-down are updated together with the value and dependency relations. These are the only modifications because, on one hand, they inform the way in which the pattern was built, including its non-generalisable relations, and, on the other hand, it is important to preserve the values of the PartOfS attributes, so the researcher/teacher knows in which strategies these can occur. The colouring attributes and the relations between cases are not important for this purpose and, therefore, they are not modified. This also has the advantage of being computationally cheaper.
Algorithm 3. InefficientCaseAcquisition(StoredCase, InputCase)
  NewCase ← StoredCase
  for i = 4 to v − 1 do   // attributes from iterations to move-down
    if the value of attribute i of NewCase differs from that of InputCase then
      replace the value of attribute i of NewCase with the one of InputCase
    end if
  end for
  for all relations between attributes do   // value and dependency relations
    replace the relations of NewCase with the ones of InputCase
  end for
  add NewCase to the set of inefficient cases
New strategy acquisition. The goal is to identify new strategies and store them for future use. New strategies could be added by the teacher or could be recognized from the learners' constructions. In the latter case, after the verification checks described below, the decision of storing a new strategy is left with the teacher. This serves as another validation step for the detected new strategy.

Algorithm 4. NewStrategyVerification(StrategiesCaseBase, InputStrategy)
  Find the most similar strategy to InputStrategy from StrategiesCaseBase
  if similarity < θ1 then
    if ValidSolution(InputStrategy) returns true then   // see Alg. 5
      NewStrategyAcquisition(InputStrategy)             // see Alg. 6
    end if
  end if
Fig. 5e illustrates the so-called "I strategy", due to its resemblance to the letter I. When compared to all strategies, it is rightly most similar to the 'VParallel' one (see Fig. 2b), as some parts correspond to it. However, the similarity is low, suggesting it may be a new strategy. Without the adaptation mechanism, the learner modelling module will infer that the learner is using the 'VParallel' strategy, but is still far from completion. This imprecise information is potentially damaging as it could, for example, lead to inappropriate system actions, e.g. providing confusing feedback by guiding the learner towards the 'VParallel' strategy. Conversely, the adaptation mechanism will prevent generating such confusing feedback, and storing the new strategy will enable appropriate feedback in the future, automatically or with input from the teacher/researcher.

Algorithm 5. ValidSolution(InputStrategy)
  if SolutionCheck(InputStrategy) returns true then   // checks if InputStrategy 'looks like' a solution
    if the number of cases of InputStrategy < θ2 then
      if InputStrategy has relations between attributes then
        RelationVerification(InputStrategy)   // verifies that the numeric relation corresponds to the task rule solution
        if successful verification then
          return true
        end if
      end if
    end if
  end if
Algorithms 4, 5 and 6 illustrate how an input strategy is identified and stored as a new strategy (composite case). If the similarity between the input strategy
and the most similar strategy from the case-base is below a certain threshold θ1 (Alg. 4), some validation checks are performed (Alg. 5) and, upon satisfaction, the new strategy is stored in the case-base (Alg. 6). If the input strategy has been introduced by a teacher and the similarity is not below θ1, the teacher can still store the new strategy, even if it is very similar to an existing one.

Algorithm 6. NewStrategyAcquisition(NewStrategy)
  add NewStrategy to the strategies case-base
  adjust the values of PartOfS
In Algorithm 5, the SolutionCheck(InputStrategy) function verifies whether InputStrategy 'looks like' a solution by examining if the mask of InputStrategy corresponds to the mask of the task. The following check takes into consideration the number of simple cases in InputStrategy. Good solutions are characterized by a relatively small number of simple cases; therefore, we propose for the value of θ2 the maximum number of cases among all stored strategies for the corresponding task, plus an error margin (such as 3). If this check is satisfied, the RelationVerification(InputStrategy) function derives a rule from the value relations of the cases and checks its correspondence to the rule solution of the task. For example, in the construction of Fig. 5c, the rule derived is 3 ∗ (red/2 + 1) + 7 ∗ (red/2), which corresponds to the solution 5 ∗ red + 3. If all checks are satisfied, the new strategy is stored in the case-base and the PartOfS values are adjusted.
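The rule check performed by RelationVerification can be illustrated with a symbolic comparison. The sketch below uses sympy, which is our choice for the illustration and not the system's actual mechanism, to confirm that the derived rule equals the task rule.

```python
# Illustration of the rule-equivalence check: the rule derived from the value
# relations of Fig. 5c should simplify to the task solution 5*red + 3.
import sympy as sp

red = sp.Symbol("red")
derived = 3 * (red / 2 + 1) + 7 * (red / 2)
solution = 5 * red + 3
print(sp.simplify(derived - solution) == 0)   # True
```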
5 Validation
The validation of our proposed mechanisms includes: (a) identifying the boundaries of how far a pattern can be modified and still be recognised as similar to its original; (b) correct identification of inefficient cases within these boundaries and (c) correct identification of new strategies. This low-level testing of the system shows how the adaptation of the knowledge-base and the learner modelling module function together to improve the performance of the system. To this end, experiments have been conducted using real data produced from classroom use of eXpresser as well as artificial data that simulated situations observed in the classroom sessions. Simulated situations were based on varying parameters of models produced by learners in order to provide more data. First, a preliminary experiment using classroom data was conducted to identify possible values for the threshold θ in Algorithm 1 and threshold θ1 in Algorithm 4. Since our main aim was to test the adaptive modelling mechanism we decided not to seek optimal values for these thresholds, but only to find a good enough value for each one. Two possibilities were quickly identified - for θ: the minimum overall similarity (4.50) minus an error margin (0.50) or value 1.00 for the numerical similarity; for θ1 : the maximum overall similarity (3.20) plus an error margin (0.30) or value 1 for the numeric similarity.
Fig. 6. (a) the construction for the ‘pond tiling’ task ; (b) ‘I Strategy’; (c) ‘H strategy’
Experiment 1: identifying the boundaries of how far a pattern can be inefficiently modified and still be recognised as similar to its original efficient form. As mentioned previously, we consider changes in a pattern that can lead to the same visual output as the original one but use different building-blocks. More specifically, these building-blocks are groups of two or more of the original efficient building-block. This experiment looks for the limits of changes that a pattern can undergo without losing its structure. We used 34 artificial inefficient cases from two tasks: (a) ‘stepping stones’ that was defined earlier and (b) ‘pond tiling’ which requires to find the number of tiles needed to surround any rectangular pond. Fig. 6 illustrates the construction for the ‘pond tiling’ problem and two strategies frequently used by students to solve this task. Compared to ‘stepping stones’, pond tiling is more difficult, requiring the use of two variables (the width and height of the pond) rather than one. From the 34 cases, 47% were from the ‘stepping stones’ task and 53% were from the ‘pond tiling’ task. Using these cases, the following boundaries were identified: (i) groups of less than 4 building-blocks; (ii) groups of 2 building-blocks repeated less than 6 times and (iii) groups of 3 building-blocks iterated less than 4 times. Experiment 2: correct identification of inefficient cases within the previously identified boundaries. From the 34 inefficient cases of Experiment 1, 13 were outside the identified boundaries and 21 were within. From the 21 cases within the boundaries, 62% were from the ‘stepping stones’ task and 38% were from the ‘pond tiling’ task. Using the previously identified values for θ, we obtained the following results: out of these 21 cases, 52.48% had the overall similarity greater than 4.00 and 100% had the numeric similarity above 1.00. These results indicate that a small modification of a pattern can drastically affect the identification of the strategy the learner is following; hence almost half the cases have an overall similarity less than 4.00. The results obtained using the numeric similarity are much better and consistent with the fact the modifications are just numerical. Experiment 3: correct identification of strategies. The data for this experiment included 10 new strategies: 7 observed in trials with pupils and 3 artificial. Out of the 10 new strategies, 4 were from the ‘pond tiling’ task; all of them were observed in trials with pupils. The remaining 6 new strategies were from the ‘stepping stones’ task, with 3 of them observed and 3 artificial. The knowledge base for the two tasks included originally 4 strategies for the ‘stepping stones’ task and 2 strategies for the ‘pond tiling’ task. Using the previously identified values for θ1 , we obtained the following results: out of the 10 new strategies, 100% had the overall similarity below 3.50 and 70% had the numeric similarity below 1.00. As opposed to Experiment 2, the overall similarity performs better, being consistent with the fact that the overall similarity reflects better the resemblance
with the stored strategies than the numeric similarity alone. Given the range that the overall similarity has, i.e. from 0 to 10, values below 3.50 indicate a very low similarity and therefore, rightly suggest that the learner’s construction is considerably different from the ones in the knowledge base.
6 Conclusions
In this paper we presented research on modelling users’ behaviour in eXpresser, an exploratory learning environment for mathematical generalisation. We adopted a case-based formulation of strategies learners follow when building constructions in eXpresser and employed similarity metrics to identify which strategy is used by each learner. Due to the open nature of the environment, however, not all strategies are known in advance. Moreover, learners use the system in inefficient ways that lead to difficulties in solving the given tasks. To overcome these problems, we developed an adaptive modelling mechanism to expand an initially small knowledge base, by identifying inefficient cases (i.e. cases that pose additional difficulty to the user’s learning process) and new strategies. Future work includes using the information on inefficient cases and new strategies in an automatic way to either incorporate it in the feedback and/or inform the teachers and allow them to author feedback.
References 1. Piaget, J.: The Psychology of Intelligence. Routledge, New York (1950) 2. de Jong, T., van Joolingen, W.R.: Scientific discovery learning with computer simulations of conceptual domains. Review of Educational Research 68, 179–202 (1998) 3. Kirschner, P., Sweller, J., Clark, R.E.: Why Minimal Guidance During Instruction Does Not Work: An Analysis of the Failure of Constructivist, Discovery, Problembased, Experiential and Inquiry-based Teaching. Ed. Psych. 41(2), 75–86 (2006) 4. Cocea, M., Magoulas, G.: Task-oriented Modeling of Learner Behaviour in Exploratory Learning for Mathematical Generalisation. In: Proceedings of the 2nd ISEE Workshop, AIED 2009, pp. 16–24 (2009) 5. Pearce, D., Mavrikis, M., Geraniou, E., Gutierrez, S.: Issues in the Design of an Environment to Support the Learning of Mathematical Generalisation. In: Proceedings of EC-TEL 2008, pp. 326–337 (2008) 6. Mason, J.: Generalisation and Algebra: Exploiting Children’s Powers. In: Haggarty, L. (ed.) Aspects of Teaching Secondary Mathematics: Perspectives on Practice, pp. 105–120 (2002) 7. Kaput, J.: Technology and Mathematics education. In: Grouws, D. (ed.) Handbook of Research on Mathematics Teaching and Learning, pp. 515–556. Macmillan, New York (1992) 8. Kolodner, J.L.: Case-Based Reasoning. Morgan Kaufmann Publishers, Inc., San Francisco (1993) 9. Anguita, D.: Smart adaptive systems: State of the art and future directions of research. In: Proceedings of EUNITE 2001, pp. 1–4 (2001) 10. Mitra, R., Basak, J.: Methods of case adaptation: A survey. International Journal of Intelligent Systems 20(6), 627–645 (2005) 11. Aksoy, S., Haralick, R.M.: Feature Normalisation and Likelihood-based Similarity Measures for Image Retrieval. Pattern Recognition Letters 22, 563–582 (2001)
An Implementation of Reprogramming Scheme for Wireless Sensor Networks
Aoi Hashizume (1), Hiroshi Mineno (2), and Tadanori Mizuno (3)
(1) Graduate School of Informatics, Shizuoka University, Japan
(2) Faculty of Informatics, Shizuoka University, Japan
(3) Graduate School of Science and Technology, Shizuoka University, Japan
3–5–1 Johoku, Naka–ku, Hamamatsu, Shizuoka 432-8011, Japan
[email protected], {mineno,mizuno}@inf.shizuoka.ac.jp
Abstract. Reprogramming sensor nodes is important for managing sensor networks. Since the number of deployments is increasing, the need for an efficient and reliable reprogramming service is growing. Reprogramming protocols use radio communication to distribute software data. Many studies on software data dissemination have been done, but there has been little knowledge about reprogramming on the sensor nodes themselves, and the challenges of reprogramming in real sensor networks have not been resolved. Therefore, we designed and implemented a reprogramming scheme on real sensor nodes.
1 Introduction
Advances in MEMS and low-power wireless communication technology have led to the development of wireless sensor networks (WSN). A typical WSN consists of a number of small battery-powered sensor nodes that sense, collect, and transfer various data autonomously. There are many WSN applications and services, including structural monitoring, security, and position tracking. These networks include state-of-the-art technologies (ad-hoc network routing, data processing, position estimation, etc.), and these technologies are implemented as specific code on the sensor nodes. These technologies are highly advanced and still developing. Therefore, this code will be modified or extended in the future for long-running applications using WSNs. Thus, a method to efficiently reprogram many deployed sensor nodes is necessary. Wireless reprogramming has been extensively researched [1] [2] [3] [4]. Wireless reprogramming distributes new code to many sensor nodes using wireless multihop communication. Then, sensor nodes reprogram themselves with the received data. Reprogramming protocols currently focus mainly on software data dissemination and discuss little about reprogramming itself [5]. Therefore, we have to address not only data dissemination but also reprogramming. We designed and implemented a reprogramming scheme on real sensor nodes. This paper is organized as follows. In Section 2, we explain several issues related to wireless reprogramming. Our reprogramming design and implementation is introduced in Section 3. We describe our experimental results in Section 4. Finally, Section 5 concludes the paper and mentions future work.
2 Reprogramming in Sensor Networks
2.1 Reprogramming Challenges
Many wireless reprogramming protocols share design challenges [1]. The first challenge is completion time. The reprogramming completion time affects services using WSNs. When we reprogram a network, we have to stop the services and wait until the code update is completed. Thus, we have to minimize the reprogramming completion time. The second challenge is energy efficiency. Sensor nodes are usually battery-powered, and the sensor node battery provides the energy used in reprogramming. This battery also supplies energy for computing, communication, and sensing functions. Therefore, reprogramming must be energy-efficient. The third challenge is reliability. Reprogramming requires the new code to be delivered throughout the entire network, and the delivered code must be executed correctly on the sensor nodes.
2.2 Reprogramming Approaches
Reprogramming protocols use several techniques to resolve the challenges listed above. Pipelining was developed to accelerate reprogramming in multihop networks [6] [7]. Pipelining allows parallel data transfer in networks. In pipelining, a program is divided into several segments, and each segment contains a fixed number of packets. Instead of receiving a whole program, a node becomes a source node after receiving only one segment. Particularly for transmitting a large program, pipelining can reduce the completion time significantly. However, pipelining has detrimental characteristics. Pipelining works well when there is a large number of hops between base stations and farthest destination nodes. In contrast, having a small number of hops causes delay in code distribution. Negotiation is used to avoid data redundancy and improve reprogramming reliability. As explained above, pipelining is done through segmentation. After the data is segmented, it is necessary to avoid broadcast storms that are caused by dealing with a large number of segments. A negotiation scheme was developed in SPIN [8]. In this scheme, the source node knows which segment is requested before sending it out. This reduces data redundancy. Hierarchical reprogramming [9] [10] accelerates software distribution and reduces the number of control packets. First, the base station sends program codes to nodes in the upper layer of the node hierarchy (i.e., pseudo-base stations). Then the pseudo-base stations distribute the codes to other nodes in their local areas. Except for Firecracker and Sprinkler, most reprogramming protocols start distributing software from a single base station in the network and assume no hierarchy. Hierarchical reprogramming protocols improve reprogramming efficiency. Deploying base stations in suitable places greatly improves the efficiency of the reprogramming [11].
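As a rough illustration of the segmentation that pipelining relies on, the sketch below splits a program image into segments of a fixed number of fixed-size packets; the payload and segment sizes are arbitrary example values, not those of any particular protocol.

```python
# Hedged sketch: divide a program image into segments of PACKETS_PER_SEGMENT
# packets, each PAYLOAD bytes, so a node can forward a segment as soon as it
# has received it completely (pipelining).
PAYLOAD = 64               # bytes of program data per packet (example value)
PACKETS_PER_SEGMENT = 16   # fixed number of packets per segment (example value)

def segment(image: bytes):
    packets = [image[i:i + PAYLOAD] for i in range(0, len(image), PAYLOAD)]
    return [packets[i:i + PACKETS_PER_SEGMENT]
            for i in range(0, len(packets), PACKETS_PER_SEGMENT)]

image = bytes(range(256)) * 20          # 5120-byte example image
segments = segment(image)
print(len(segments), "segments,", len(segments[0]), "packets in the first one")
```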
3 Proposed Reprogramming Scheme Design
3.1 Targeted Environment
As described in Section 2, most existing protocols focus on software data dissemination and hardly consider the reprogramming scheme on the sensor node. In the following sections, we consider a reprogramming scheme that includes hardware, network topology, program sections, and algorithms. The development of this scheme concerns the construction of a sensor network utilizing ZigBee nodes produced by Renesas Solutions Corporation (RSO) [12]. Two kinds of boards make up our sensor system. The first is the battery-operated Renesas sensor board in Fig. 1, which has three kinds of sensors: motion, temperature, and light. It can be operated either with four AAA batteries or an AC adaptor. This board is used as the sensor node. The other board is a ZigBee evaluation board, shown in Fig. 2. It has no sensors or batteries, and therefore a power supply cable is required. This board is equipped with an RS-232C interface which can be connected to a PC. The board can be used in three ways, one board for each role: as a ZigBee router node, as a coordinator node, or as a sink node. A specified node works as a base station node (coordinator node). The assumed network topology is shown in Fig. 3. It requires several PCs, PC-connected base station nodes, and sensor nodes. A PC sends new code to a base station, and the base station sends the code to neighboring sensor nodes. Then, the sensor nodes update their firmware (FW) with the received code.
Fig. 1. Sensor Board
Fig. 2. ZigBee Evaluation Board
Fig. 3. Network Topology (PCs, base station nodes, and sensor nodes connected by wireless communication)
3.2 Design and Implementation
We designed the reprogramming scheme to solve several of the problems described in Section 2. Here, we describe the memory allocation of the program sections. Table 1 and Fig. 4 show the program data allocation of a microcomputer in our targeted environment.

Table 1. Program data allocation of microcomputer
Data | Section code | Allocation
Automatic variable | stack | RAM
Static variable (has no initial value) | bss | RAM
Static variable (has initial value) | data | RAM
Initial value of static variable | data_I | ROM
String, constant number | rom | ROM
Program code | program | ROM
Fig. 4. Example of allocation (source-code declarations mapped to the bss, data, data_I, rom, stack, and program sections in RAM and ROM)
We designed several program sections in memory not only for general use but also for reprogramming use. Table 2 and Fig. 5 show the address allocation of the program sections. Program data is allocated to the memory area in units of sections. The program sections for general use and for reprogramming use are separated to avoid address collisions when we reprogram nodes. We then have to update only the reprogramming sections and do not have to change the whole program. This reduces the amount of reprogramming data; less data shortens the completion time and reduces energy consumption.
Table 2. Address allocation of program sections
Section name | Section code | Address allocation
update_data | data | 0x003000 – 0x0030FF
update_bss | bss |
update_bss_I | bss_I | 0x0F8000 – 0x0F80FF
update_rom | rom | 0x0F9000 – 0x0F90FF
update_prog | program | 0x0FA000 – 0x0FA1FF
Fig. 5. Program Section (general-use and reprogramming-use sections in RAM and ROM)
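To show how this separation pays off, the sketch below packs only the reprogramming-use sections into the update payload, so the general-use sections never have to be transmitted. The section list follows Table 2, but the packing format and helper function are our own illustration, not the RSO firmware.

```python
# Hedged sketch: only the reprogramming-use sections are packed into the
# firmware-update payload; general-use sections are never sent.
UPDATE_SECTIONS = ["update_data", "update_bss", "update_bss_I",
                   "update_rom", "update_prog"]

def build_update_payload(section_images):
    """section_images: dict mapping section name -> bytes placed in that section."""
    payload = b""
    for name in UPDATE_SECTIONS:
        data = section_images.get(name, b"")
        # a tiny per-section header: name length, name, data length (illustrative)
        payload += bytes([len(name)]) + name.encode() + len(data).to_bytes(2, "big") + data
    return payload

images = {"update_prog": b"\x01" * 300, "update_rom": b"\x02" * 40}
print(len(build_update_payload(images)), "bytes to disseminate")
```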
3.3 Reprogramming Algorithm
Base Station Node. Here, we explain the reprogramming algorithms. Fig. 6 illustrates flow charts of the firmware update algorithms on the base station nodes. Fig. 6(a) shows the initialization process of the firmware update. If the process mode is a FW update, the base station node sets a FW update flag and calls the FW update function (appFWupdate). After appFWupdate returns, its return value is set as the process mode. Furthermore, the node resets the FW update flag and the FW data receipt flag. Fig. 6(b) shows the main firmware update process. The base station calls the FW data get function (appFWdataGet) and gets the FW data from the PC by calling this function. If the base station gets the FW data, it enters a loop process. If the FW data ACK is FWDATA_ACK_UPDATE_OK/NG or a time-out occurs, this loop is broken. Sensor Node. Next, Fig. 7 illustrates flow charts of the firmware update algorithms on the sensor nodes. Fig. 7(a) shows the initialization process of the firmware update. If the process mode is a FW update, the sensor node analyzes the FW data message and sets the FW data size in a FW data structure. Furthermore, the node initializes a
Fig. 6. Flow charts of algorithms on base station nodes: (a) initialization, (b) firmware update
sequence number and a FW data counter. After the initialization of the FW data structure, the sensor node calls the FW update function (appFWupdate). After appFWupdate returns, its return value is set as the process mode. Fig. 7(b) shows the main firmware update process. The sensor node calls the message check function (appFWdataChk). If appFWdataChk returns FWDATA_GET_OK, the node sends a FW data ACK message to the base station node to request the next FW data. If appFWdataChk returns FWDATA_GET_NG, the node sends a FW data NACK message to the base station node to request the lost FW data. After sending the FW data ACK/NACK message, the sensor node receives a FW data message and then calls appFWdataChk again. If appFWdataChk returns FWDATA_GET_COMP, the node calls the ROM update function (appFWupdateRom). Finally, this node sends a FW data ACK and finishes the operation. In this instance, the return value is "No Operation."
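The sensor-node side of Fig. 7(b) can be summarised by the following sketch. It is written in Python rather than the C firmware actually used, and receive_fw_data, send_ack and send_nack are placeholders for the real radio primitives; app_fw_data_chk and app_fw_update_rom mirror appFWdataChk and appFWupdateRom but are assumptions here.

```python
# Hedged sketch of the sensor-node firmware-update loop of Fig. 7(b).
# Return codes mirror the text: FWDATA_GET_OK / _NG / _COMP.
FWDATA_GET_OK, FWDATA_GET_NG, FWDATA_GET_COMP = range(3)

def fw_update_loop(first_message, receive_fw_data, send_ack, send_nack,
                   app_fw_data_chk, app_fw_update_rom):
    message = first_message                  # FW data message analyzed during initialization
    while True:
        result = app_fw_data_chk(message)    # corresponds to appFWdataChk
        if result == FWDATA_GET_COMP:
            app_fw_update_rom()              # corresponds to appFWupdateRom
            send_ack()                       # final ACK, then finish
            return "No Operation"
        if result == FWDATA_GET_OK:
            send_ack()                       # request the next FW data
        else:                                # FWDATA_GET_NG
            send_nack()                      # request the lost FW data again
        message = receive_fw_data()          # wait for the next FW data message
```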
4 Experimental Results
We allocated the sensing-interval and LED-blink-interval variables in the program section for reprogramming use. Then, we sent new variable values from the PC to the sensor nodes through the base station node. The results showed that the sensor nodes were correctly reprogrammed.
Fig. 7. Flow charts of algorithms on sensor nodes: (a) initialization, (b) firmware update
5 Conclusion
A reprogramming scheme was implemented in a wireless sensor network. First, we described reprogramming challenges and approaches. Then, we explained the design of the reprogramming scheme, the allocation of program sections and reprogramming algorithms on the sensor nodes and the base station nodes. In addition, we demonstrated that our proposed scheme worked correctly. In future work, we will try to implement the latest reprogramming protocols in real sensor networks. Then we will research several problems and solutions in actual environments.
Acknowledgments This work is partially supported by the Knowledge Cluster Initiative of the Ministry of Education, Culture, Sports, Science and Technology.
References 1. Wang, Q., Zhu, Y., Cheng, L.: Reprogramming wireless sensor networks: challenges and approaches. IEEE Network 20(3), 48–55 (2006) 2. Thanos, S., Tyler, M., John, H., et al.: A remote code update mechanism for wireless sensor networks, CENS Technical Report (2003) 3. Pradip, D., Yonghe, L., Sajal, K.D.: ReMo: An Energy Efficient Reprogramming Protocol for Mobile Sensor Networks. In: Proc. IEEE PerCom, pp. 60–69 (2008) 4. Leijun, H., Sanjeev, S.: CORD: Energy-efficient Reliable Bulk Data Dissemination in Sensor Networks. In: Proc. IEEE INFOCOM, pp. 574–582 (2008) 5. Pascal, R., Roger, W.: Decoding Code on a Sensor Node. In: Proc. IEEE DCOSS, pp. 400–414 (2008) 6. Jonathan, W.H., David, C.: The Dynamic Behavior of a Data Dissemination Protocol for Network Programming at Scale. In: Proc. ACM SenSys, pp. 81–94 (2004) 7. Sandeep, S.K., Limin, W.: MNP: Multihop Network Reprogramming Service for Sensor Networks. In: Proc. IEEE ICDCS, pp. 7–16 (2005) 8. Kulik, J., Heinzelman, R.W., Balakrishnan, H.: Negotiation-Based Protocols for Disseminating Information in Wireless Sensor Networks. Wireless Networks 8(23), 169–185 (2002) 9. Philip, L., David, C.: The Firecracker Protocol. In: Proc. ACM SIGOPS European Workshop (2004) 10. Vinayak, N., Anish, A., Prasun, S.: Sprinkler: A Reliable and Energy Efficient Data Dissemination Service for Wireless Embedded Devices. In: Proc. IEEE RTSS, pp. 277–286 (2005) 11. Aoi, H., Hiroshi, M., Tadanori, M.: Base Station Placement for Effective Data Dissemination in Sensor Networks. Journal of Information Processing 18, 88–95 (2010) 12. Renesas Solutions Corporation, http://www.rso.renesas.com/
Predicting e-Learning Course Adaptability and Changes in Learning Preferences after Taking e-Learning Courses Kazunori Nishino, Toshifumi Shimoda, Yurie Iribe, Shinji Mizuno, Kumiko Aoki, and Yoshimi Fukumura Kyushu Institute of Technology, Faculty of Computer Science and Systems Engineering, 680-4 Kawazu, Iizuka, Fukuoka, 820-8502 Japan [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Abstract. As a result of investigation on learning preferences and e-learning course adaptability among the students of full online courses offered by a consortium of higher education institutions, it has been found that there is a relationship between the two factors (the preference for asynchronous learning and that for the use of computer) in the learning preferences and the e-learning course adaptability. In addition, it has been found that the learning preferences of a student may change after taking an e-learning course. Furthermore, it has been found that a multiple regression analysis of the learning preferences of a student at the beginning of an e-learning course can somewhat predict the adaptability of the course at the end. Keywords: e-learning, learning preferences, e-learning adaptability, asynchronous learning, multiple regression analysis.
1 Introduction E-learning has an advantage of allowing learners to study at “any time” and “any place.” In addition, it allows diverse learning forms with the use of information and communication technologies (ICT). On the other hand, as e-learning is usually conducted asynchronously, it requires more self-discipline of students in comparison with face-to-face classes. E-learning might be easier for students who want to learn at their own pace to continue and complete a study. However, it can be challenging for those who do not like studying on their own and prefer doing in face-to-face classes. The use of learning management systems (LMS) can ease the distribution of course materials and the communication among students or between students and teaching staff. Some measures have been taken to help students understand the content of e-learning materials and also to motivate students in studying materials through e-mails sent by teachers and tutors of e-learning courses [1]. However, the use of ICT in e-learning tends to become complex as its functionality increases and may discourage those students who are not familiar with the ICT use. R. Setchi et al. (Eds.): KES 2010, Part II, LNAI 6277, pp. 143–152, 2010. © Springer-Verlag Berlin Heidelberg 2010
The use of ICT and asynchronous learning is a typical characteristic of e-learning. However, as it is stated earlier, those who do not like asynchronous learning or the use of ICT may have the tendency to drop out in the middle of e-learning courses. Therefore, it is desirable that students and their teachers know the students’ learning preferences and their adaptability of e-learning courses in advance [2]. To investigate the learning preferences in e-learning, we developed learning preference questionnaire items asking preferences in studying, understanding, questioning, and doing homework [3]. This study investigates the change in learning preferences after taking an e-learning course, using the learning preference questionnaire mentioned above. It also proposes a method to predict e-learning course adaptability at the end of the course based on the multiple regression analysis of learning preferences at the beginning of the course.
2 Flexibility of Learning Styles and Learning Preferences The research on learning styles and learning preferences has been prolific in Europe and North America. According to the Learning Skills Research Center (LSRC) in U.K., the number of journal articles on the learning styles and learning preferences has reached more than 3,800. In those articles, 71 different theories and models for learning styles and preferences have been presented. LSRC has selected 13 most prominent theories and models of learning styles and preferences from the 71 theories and models, and further studied the 13 models [4]. LSRC classified the 13 models of learning styles and preferences into five categories from the most susceptible to the least susceptible to environments based on Curry’s onion model [5]. Previously some studies were conducted using the Kolb’s learning style [6] in developing computer-based training (CBT) [7] and examining the influence of learning styles on the “flow” experience and learning effectiveness in e-learning [8]. Other studies used GEFT (Group Embedded Figure Text) [9] to see the influence of learning styles and learning patterns on learning performance [10] and the instrument developed by Dunn, Dunn and Price [11] to build a system which provides learning environment suitable to the student’s learning style [12]. When investigating learning styles and learning preferences in e-learning, how should we consider the “flexibility of learning styles and preferences?” E-learning has the potential to provide “student-centered learning” and tends to be designed based on the pedagogy of providing learning environments according to the students’ needs, abilities, preferences and styles rather than providing uniform education without any consideration of individual needs and differences. Therefore, it is meaningful to provide students and teachers with information about the students’ adaptability to e-learning courses by using a questionnaire on learning preferences in e-learning. Here we use the term “learning preferences” instead of “learning styles” as the term, “preferences” connotes more flexibility than “styles.” This study looks at learning preferences of students in e-learning courses and determines if their learning preferences regarding asynchronous learning and the use of ICT of a student changes after taking an e-learning course.
3 Survey on Learning Preferences and e-Learning Course Adaptability 3.1 Survey on Learning Preferences The survey on learning preferences was administered to those students who enrolled in the eHELP (e-Learning for Higher Education Linkage Project) which is a credit transfer system for e-learning courses offered by multiple member universities in Japan. In eHELP, students take one to three full online course(s) offered by other institutions in parallel to taking courses offered by their own institution. In taking an e-learning course, a student studies the content which is equivalent to 15 90-minute face-to-face classes. The majority of e-learning courses offered in eHELP are those in which students study by watching video lectures of instructors while using the downloadable text materials. This study was conducted from the early December of 2008 to the early January of 2009 when all the e-learning courses were completed. All the items in the questionnaire were asked with the 7-point Likert-type scale; from 1 being “don’t agree at all” to 7 “agree strongly,” and we obtained valid responses from 53 students. Those responses in which answers to the items were all the same including the reverse coded items were considered invalid. The questionnaire consists of 40 items asking preferences in studying, understanding, questioning, and doing homework in terms of asynchronous learning and the use of ICT. The questionnaire was made available online and students accessed the questionnaire online. As the result of the factor analysis [3], we could extract three factors with eigenvalues over .07(see Appendix 1): the factor 1 being “preference for asynchronous learning,” the factor 2 “preference for the use of ICTs in learning” and the factor 3 “preference for asynchronous digital communication.” 3.2 The Survey on e-Learning Course Adaptability When the learning preference questionnaire was administered, the questionnaire on e-learning course adaptability was also administered to the students who enrolled in eHELP courses. The items in the questionnaire are shown in the Table 1. The questionnaire consists of 10 items asking psychological aspects of learning such as the level of students’ understanding and the level of satisfaction. The questionnaire (see Table 1) was administered online to the students enrolled in each of the eHELP courses upon their completion of the course (i.e., between December 2008 and January 2009) and 69 completed responses were obtained. All the items in the questionnaire were asked with the 7-point Likert-type scale; from 1 being “don’t agree at all” to 7 “agree strongly.” The scores for the item (g) and (h) were reverse-coded. The mean score was 4.7. The mean score was calculated for the e-learning course adaptability, the factors 1, 2, and 3 respectively for each student, and the values were used in the subsequent analyses. In addition, the reverse-coded items were recoded to adjust to the other items.
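For readers who wish to reproduce this kind of analysis, a minimal sketch of the factor-extraction step described in Section 3.1 is given below. The placeholder response matrix, the eigenvalue cut-off and the use of scikit-learn's FactorAnalysis with varimax rotation are assumptions made for illustration only; the exact procedure used in the study is reported in [3].

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Placeholder responses: 53 students x 40 items on a 7-point scale (random stand-in data).
rng = np.random.default_rng(0)
responses = rng.integers(1, 8, size=(53, 40)).astype(float)

corr = np.corrcoef(responses, rowvar=False)                 # item-by-item correlation matrix
eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]       # eigenvalues, largest first
n_factors = max(1, int(np.sum(eigenvalues > 1.0)))          # Kaiser-style cut-off (assumption)
fa = FactorAnalysis(n_components=n_factors, rotation="varimax").fit(responses)
loadings = fa.components_                                    # factor-by-item loading matrix
print(n_factors, loadings.shape)
```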
Table 1. The question items in the e-learning course adaptability questionnaire and the mean scores

Item                                                                                          Mean
(a) The content of this e-learning course is more understandable than regular class contents.  4.51
(b) The style of learning of this e-learning course is easier to learn than regular class.     4.90
(c) The pace of this e-learning course is more suitable than regular class.                    4.91
(d) This e-learning course is more satisfying than regular class.                              4.36
(e) This e-learning course is more effective than regular class.                               4.35
(f) This e-learning course is more interesting than regular class.                             4.91
(g) This e-learning course makes me more tired than regular class. (recoded)                   4.84
(h) This e-learning course makes me more nervous than regular class.                           5.59
(i) This e-learning course brings me more endeavor than regular class. (recoded)               4.07
(j) This e-learning course brings me more motivation than regular class.                       4.41
3.3 Correlations
Correlations between the scores of the three learning preference factors and the score for the e-learning course adaptability were analyzed among the 69 respondents who completed both of the two questionnaires. The correlation r is shown in Table 2. A statistically significant (p < 0.01) correlation was seen between the learning preference factor 1 (the preference for asynchronous learning) and the e-learning course adaptability and between the factor 2 (the preference for the use of ICT in learning) and the adaptability. The correlation between the learning preference factor 3 (the preference for asynchronous digital communication) and the e-learning course adaptability is not as high; however, the correlation is statistically significant at the level of p < .05.

Table 2. Correlations between the e-learning course adaptability and the learning preference factors

                            r      p        n
Adaptability - Factor 1     0.53   < 0.01   69
Adaptability - Factor 2     0.60   < 0.01   69
Adaptability - Factor 3     0.29   < .05    69
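Correlations of the kind reported in Table 2 can be computed along the lines of the SciPy sketch below. The score arrays here are random placeholders standing in for the 69 students' mean scores, so the printed values will not match the table.

```python
import numpy as np
from scipy.stats import pearsonr

# Placeholder data: in the study these would be the 69 students' mean scores.
rng = np.random.default_rng(0)
adaptability = rng.uniform(1, 7, 69)
factors = {"Factor 1": rng.uniform(1, 7, 69),
           "Factor 2": rng.uniform(1, 7, 69),
           "Factor 3": rng.uniform(1, 7, 69)}

for name, scores in factors.items():
    r, p = pearsonr(adaptability, scores)
    print(f"Adaptability - {name}: r = {r:.2f}, p = {p:.4f}, n = {scores.size}")
```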
3.4 Multiple Regression Analysis In order to further investigate the relationships between the e-learning course adaptability and each of the three factors of learning preferences, a multiple regression analysis was conducted. The results are shown in Table 3.
Table 3. The result of a multiple regression analysis

Variable Name                                                  Regression Coefficient   p
Intercept                                                      1.82                     ** < 0.001
Factor 1 (preference for asynchronous learning)                0.23                     ** 0.0054
Factor 2 (preference for the use of ICT in learning)           0.45                     ** 0.0003
Factor 3 (preference for asynchronous digital communication)   0.01                     0.938
Multiple R-square                                              0.43                     ** < 0.001
N                                                              69                       ―
** significant at p = 0.01   * significant at p = 0.05
As shown in Table 3, the regression coefficients of the factor 1 and the factor 2 are relatively high and the p values are less than 0.01. However, as for the factor 3, the regression coefficient is low and its p value is also not significant enough. It can be suspected that the multicollinearity is high between the factor 2 and the factor 3. Therefore, another multiple regression was conducted excluding the factor 3. As the result, the regression coefficients for the factor 1, the factor 2, and the intercept resulted in 0.23, 0.45, and 1.84 respectively. Therefore, the multiple regression equation was derived as follows, Adaptability to e-learning courses = 1.84+0.23×factor 1+0.45×factor 2
(1)
It is possible to predict the e-learning course adaptability for students in eHELP based on this formula.
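A minimal sketch of applying formula (1) is shown below; the coefficients are those reported in Section 3.4, while the factor means passed to the function are hypothetical and only illustrate the expected 1–7 scale.

```python
# Sketch of formula (1); coefficients 1.84, 0.23 and 0.45 come from Section 3.4.
def predict_adaptability(factor1_mean: float, factor2_mean: float) -> float:
    """Predicted e-learning course adaptability on the 7-point scale."""
    return 1.84 + 0.23 * factor1_mean + 0.45 * factor2_mean

# Hypothetical student with factor means of 5.2 and 4.8 at the start of a course.
print(predict_adaptability(5.2, 4.8))
```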
4 Discussion on Changes in Learning Preferences and the Result of the Multiple Regression Analysis
4.1 Changes in Learning Preferences
In order to further investigate the flexibility of learning preferences in e-learning discussed in Section 2, the changes in learning preferences of students before and after taking an e-learning course in the Spring semester (from the early April, 2009, to the early July, 2009) were investigated, using the learning preference questionnaire mentioned previously. Figure 1 indicates the changes in learning preferences of the 18 students who responded to the questionnaire both at the beginning and at the end of the course. It shows, as bar charts, the scores of responses at the beginning of the course subtracted from the scores at the end of the course. Hence, the vertical axis in Figure 1 indicates the sum of the differences in scores before and after the course. The white bars on the left side of Figure 1 (from q1 to q17) show the changes in the factor 1 (preference for asynchronous learning) and the black bars on the right side indicate the changes in the factor 2 (preference for the use of ICT). As a whole, the factor 1 got positive scores and the factor 2 got negative scores.
The paired sample t-test was applied to the changes in scores shown in Figure 1. As a result, q1, q2, and q26 showed significant differences (p < 0.05), as indicated by ** in the figure. In addition, the paired sample t-test of scores on q3 and q4 showed significant differences at p < 0.10, as indicated by * in the figure. Hence, it has been found that the learning preferences change after taking e-learning courses with regard to the five items indicated above. By taking e-learning courses, students' preference for asynchronous learning tends to change positively, while their preference for the use of ICT tends to change negatively. Therefore, it has been found that the learning preferences for asynchronous learning and the use of ICT can change in e-learning environments.

[Fig. 1 bar chart: differences in scores (vertical axis) by item number (horizontal axis).]
Fig. 1. The changes in learning preferences regarding asynchronous learning and the use of ICT after taking an e-learning course
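The per-item paired-sample t-test described above can be reproduced along the lines of the sketch below; the before/after arrays are invented placeholders for the 18 respondents' scores on one questionnaire item, not the study's data.

```python
import numpy as np
from scipy.stats import ttest_rel

# Placeholder scores for one item from the 18 respondents, before and after the course.
before = np.array([4, 5, 3, 6, 4, 5, 2, 4, 5, 6, 3, 4, 5, 4, 6, 5, 4, 3])
after  = np.array([5, 5, 4, 6, 5, 6, 3, 5, 5, 6, 4, 5, 6, 4, 6, 6, 5, 4])

t, p = ttest_rel(after, before)
print(f"t = {t:.2f}, p = {p:.3f}")   # items with p < 0.05 would be flagged ** as in Fig. 1
```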
4.2 Multiple Regression Analyses The possibility of predicting the adaptability to e-learning courses using the formula presented in the section 3.4 was investigated using the data obtained from the e-learning courses offered in the Spring semester of eHELP (from the early April, 2009, to the early July, 2009) and in the Fall semester (from the early September, 2009, to the late December, 2009). First, in order to test the reliability of the multiple regression formula developed as a result of the 2008 study, we investigated the correlation between the actual scores obtained from the 2009 Spring students using the questionnaire of the e-learning course adaptability and the calculated scores using the multiple regression formula (n=15). The results are shown in the Figure 2. The correlation coefficient is 0.64, and a significant correlation is found between the scores obtained from the learning preference questionnaire (the total factor scores combining the scores for the factors 1 and 2) at the end of the e-learning course and the adaptability scores calculated using the multiple regression formula. Based on the result, it can be concluded that it is possible to predict
[Fig. 2 scatter plot: calculated scores (horizontal axis, 1–7) against actual scores (vertical axis, 1–7).]
Fig. 2. The correlation between the actual scores of the adaptability to e-learning courses and the calculated scores (at the end of the spring semester 2009)
the e-learning course adaptability using the multiple regression formula (1) discussed in the section 3.4. Then, applying the scores obtained at the beginning of the course to the multiple regression formula, we attempted to predict the e-learning course adaptability at the end of the course. The Figure 3 shows the correlation between the e-learning course adaptability scores calculated by applying the item scores for the factors 1 and 2 obtained at the beginning of the Spring course and the actual scores obtained at the end of the course (n=18). In this case, the correlation coefficient is 0.01, which indicates that the e-learning course adaptability cannot be predicted. The Figure 4 shows the correlation between the actual scores and the scores calculated using the multiple regression formula without the scores of the five items (from q1 to q4 and q26) that tend to change after experiencing e-learning as discussed in the section 4.1. As the correlation coefficient is 0.65, it can be concluded that there is a significant correlation between the calculated scores and the actual scores. This is similar to the correlation coefficient shown in the Figure 2, and it shows the possibility of predicting the e-learning course adaptability at the end of the course by using the formula with the scores excluding those of the learning preference items that may change after taking an e-learning course. In a similar vein, the Figure 5 shows the correlation between the actual scores and the scores of e-learning course adaptability calculated with all the items of the factor 1 and 2 gained at the beginning of the course in Fall and (n=15, r=0.10). In addition, the Figure 6 shows the correlation between the actual scores and the e-learning adaptability scores calculated excluding the changeable items (n=14, r=0.56). It indicates a correlation similar to the one obtained for the spring semester. These results demonstrate that it is possible to predict the e-learning adaptability of an individual student by administering the learning preference questionnaire and applying the responses to the multiple regression formula (1) discussed in the section 3.4.
[Figs. 3–6: scatter plots of calculated scores (horizontal axis, 1–7) against actual scores (vertical axis, 1–7).]
Fig. 3. The correlation between the actual scores and the calculated scores of e-learning adaptability using all the items of the factor 1 and 2 (the spring semester)
Fig. 4. The correlation between the actual scores and the calculated scores of e-learning adaptability excluding the items that are changeable after taking an e-learning course (the spring semester)
Fig. 5. The correlation between the actual scores and the calculated scores of e-learning adaptability using all the items of the factor 1 and 2 (the fall semester)
Fig. 6. The correlation between the actual scores and the calculated scores of e-learning adaptability excluding the items that are changeable after taking an e-learning course (the fall semester)
5 Conclusion This study investigated the relationship between learning preferences and e-learning course adaptability by administering questionnaires to students who were enrolled in e-learning courses at higher education institutions. The results of the study show that the learning preferences regarding asynchronous learning and the use of ICT may change after taking e-learning courses. It has been also found that there is a significant
correlation between the actual e-learning course adaptability scores and the scores calculated using the multiple regression formula with the response scores of the items that do not tend to change. Based on those results, it is concluded that the e-learning course adaptability of an individual student at the end of the course can be predicted by administering the questionnaire at the beginning of the course. In the future, we would like to increase the reliability and the validity of the questionnaire by administering it to those students who are taking e-learning courses outside the eHELP. In addition, we would like to continue the research on the factors influencing e-learning so that students can choose learning methods and environment that suit their learning preferences.
References 1. Fuwa, Y., Ushiro, M., Kunimune, H., Niimura, M.: Efforts toward the Establishment of Quality Assurances for Adults Students of Distance Learning on e-Learning System -Practice and Evaluations of Support and Advice Activities. Journal of Multimedia Aided Education Research 3(2), 13–23 (2007) 2. Nishino, K., Ohno, T., Mizuno, S., Aoki, K., Fukumura, Y.: A Study on learning Styles of Japanese e-learning learners. In: 11th International Conference on Humans and Computers, pp. 299–302 (2008) 3. Nishino, K., Toya, H., Mizuno, S., Aoki, K., Fukumura, Y.: The relationship between the learning styles of the students and their e-learning course adaptability. In: Velásquez, J.D., Ríos, S.A., Howlett, R.J., Jain, L.C. (eds.) Knowledge-Based and Intelligent Information and Engineering Systems. LNCS, vol. 5712, pp. 539–546. Springer, Heidelberg (2009) 4. Coffield, F., Moseley, D., Hall, E., Ecclestone, K.: Learning Styles and Pedagogy in Post-16 Learning. In: A Systematic and Critical Review, Learning and Skills Research Center, London (2004) 5. Curry, L.: An Organization of Learning Styles Theory and Constructs. In: ERIC Document 235185 (1983) 6. Kolb, D.A.: LSI Learning-Style Inventory. McBer & Company/ Training Resources Group, Boston (1985) 7. Henke, H.: Learning Theory: Applying Kolb’s Learning Style Inventory with Computer Based Training. In: A Project paper for A Course on Learning Theory (1996) 8. Rong, W.J., Min, Y.S.: The effects of learning style and flow experience on the effectiveness of e-learning. In: Fifth IEEE International Conference on Advanced Learning Technologies, pp. 802–805 (1996) 9. Witkin, H., Oltman, P., Raskin, E., Karp, S.: A Manual for The Group Embedded Figures Test, Palo Alto, California (1971) 10. Lu, J., Yu, C.S., Liu, C.: Learning style, learning patterns, and learning performance in a WebCT-based MIS course. In: Information & Management, pp. 497–507 (2003) 11. Dunn, R., Dunn, K., Price, G.E.: The Learning Style Inventory. Price Systems, Lawrence (1989) 12. Wolf, C., Weaver, I.: Towards ‘learning style’-based e-learning in computer science education. In: Australasian Computing Education Conference (ACE 2003), vol. 20 (2003)
Appendix 1: The Learning Preference Questionnaire

Factor 1: preference for asynchronous learning
q1  When I study through computers, I tend not to care how others study.
q2  I tend to learn more actively when I study alone than studying with others at one place.
q3  I can familiarize myself better when I study independently at my convenience than studying with others at one place.
q4  I study at my own pace and do not care how others study.
q5  I would rather study alone at the place and time convenient to me than learn in class with other people.
q6  I can concentrate better when I study independently at my convenience than studying with others at one place.
q7  I can learn better when I study at the time I decide than when I study at the time decided by others.
q8  I would rather do group learning through computers than face-to-face.
q9  I would rather follow the computer instruction rather than study reading textbooks.
q10 I understand better when I study at my convenient time rather than learning in class with other people.
q11 I feel more motivated when I study using computers than learning from teachers in person.
q12 It is easier for me to take test individually than to take one in a place with others.
q13 I feel less tired looking at a computer screen than looking at a blackboard or a large screen in a classroom.
q14 I want to study at my own pace.
q15 I can be more creative when I study alone than studying with others at one place.
q16 I feel more motivated when I study at my convenience than learning in class with other people.
q17 I feel less tired when I study independently at my convenience than studying with others at one place.

Factor 2: preference for the use of ICT in learning
q18 I understand better when I learn through computers than when I learn by reading books.
q19 I tend to learn more actively using computers than studying in class.
q20 I prefer learning through computers to learning by reading books.
q21 I am familiar with computers.
q22 It is easier for me to take test on a computer than on paper.
q23 I would rather submit my report in an electronic format than in a paper and pencil format.
q24 I prefer taking notes using a computer than writing on paper.
q25 I can concentrate better looking at a computer screen than looking at a blackboard or a large screen in a classroom.
q26 It is easier for me to memorize what is on a computer rather than to review printed materials.

Factor 3: preference for asynchronous digital communication
q27 I would rather receive answers later from teachers via mail than asking questions in person or through chat.
q28 I prefer communicating via email to communicating through telephones.
q29 I would rather ask questions using email or bulletin boards than asking teachers in person.
q30 It is easier for me to communicate through computers or cell phones than to communicate face-to-face.
q31 I can be more creative when I think on paper than using computers.
A Logic for Incomplete Sequential Information Norihiro Kamide Waseda Institute for Advanced Study, Waseda University, 1-6-1 Nishi Waseda, Shinjuku-ku, Tokyo 169-8050, Japan [email protected]
Abstract. Describing incomplete sequential information is of growing importance in Knowledge Representation in Artificial Intelligence and Computer Science. To obtain logical foundations for representing incomplete sequential information, a new logic, called sequence-indexed constructive propositional logic (SLJ), is introduced as a Gentzen-type sequent calculus by extending Gentzen’s LJ for intuitionistic logic. The system LJ is known as useful for representing incomplete information, and SLJ is obtained from LJ by adding a sequence modal operator which can represent sequential information. The cut-elimination and decidability theorems for SLJ are proved. A sequence-indexed Kripke semantics is introduced for SLJ, and the completeness theorem with respect to this semantics is proved. A logic programming framework can be developed based on SLJ. Keywords: Constructive logic, sequential information, incomplete information, sequent calculus, completeness.
1
Introduction
Aimes and results. Describing incomplete sequential information is of growing importance in Knowledge Representation in Artificial Intelligence and Computer Science. To obtain logical foundations for representing incomplete sequential information, a new logic, called sequence-indexed constructive propositional logic (SLJ), is introduced as a Gentzen-type sequent calculus by extending Gentzen’s LJ for intuitionistic logic [1]. The system LJ is known as useful for representing incomplete information, and SLJ is obtained from LJ by adding a sequence modal operator [2] which can represent sequential information. A theorem for embedding SLJ into LJ is shown, and the cut-elimination and decidability theorems for SLJ are shown by using this embedding theorem. A sequence-indexed Kripke semantics is introduced for SLJ, and the completeness theorem with respect to this semantics is proved. A sequence modal operator in temporal reasoning was studied in [2] based on a branching-time temporal logic, CTLS∗ . This logic can suitably represent temporal reasoning about sequential information. The logic CTLS∗ is, however, not appropriate for obtaining a proof-theoretical basis for reasoning with sequential R. Setchi et al. (Eds.): KES 2010, Part II, LNAI 6277, pp. 153–162, 2010. c Springer-Verlag Berlin Heidelberg 2010
information. A proof system such as a Gentzen-type sequent calculus with completeness and cut-elimination theorems is required for developing an automated theorem proving method which can deal with incomplete sequential information. In this paper, we thus develop such a sequent system, SLJ. Incomplete information. In (extensions of) standard classical propositional logic, the law of excluded middle α∨¬α is valid. This means that the information represented by classical logic is complete information: every formula α is either true or not true in a model. Representing only complete information is plausible in classical mathematics, which is a discipline handling eternal truth and falsehood. The statements of classical mathematics do not change their truth value in the course of time, and the classical mathematician may assume every situation to support either the truth or the falsity of such a statement. The assumption of complete information is, however, inadequate when it comes to representing the information available to real world agents. We wish to explore the consequences of incomplete information about computer and information systems, and then it is desirable to avail of a logic which is paracomplete in the sense of not validating the law of excluded middle [5]. For representing the development of incomplete information, it turned out that constructive logics including intuitionistic logic are useful as base logics for Knowledge Representation. Indeed, constructive (intuitionistic) modal logics have been studied by many researchers (see e.g., [3]). Sequential information. The notion of “sequences” is fundamental to practical reasoning in Computer Science, because it can appropriately represent “data sequences,” “program-execution sequences,” “action sequences,” “time sequences,” “word (character or alphabet) sequences,” “DNA sequences” etc. The notion of sequences is thus useful to represent the notions of “information,” “attributes,” “trees,” “orders,” “preferences,” “strings,” “vectors,” and “ontologies.” “Sequential information” can appropriately be represented by sequences, because a sequence structure gives a monoid M, ;, ∅ with informational interpretation [6]: 1. M is a set of pieces of (sequential, ordered or prioritized) information (i.e., a set of sequences), 2. ; is a binary operator (on M ) that combines two pieces of information (i.e., a concatenation operator on sequences), 3. ∅ is an empty piece of information (i.e., the empty sequence). The sequence modal operator [b] represents labels as sequential information. A formula of the form [b1 ; b2 ; · · · ; bn ]α intuitively means that “α is true based on a sequence b1 ; b2 ; · · · ; bn of information pieces.” Further, a formula of the form [∅]α, which coincides with α, intuitively means that “α is true without any sequential information.” The semantics proposed in this paper is regarded as a natural extension of the standard Kripke semantics for intuitionistic logic. In ˆ |= α, are indexed by a this semantics, the satisfaction relations, denoted as (x, d) ˆ sequence d of symbols, and the satisfaction relation (x, ∅) |= α, which is indexed by the empty sequence ∅, just corresponds to the usual satisfaction relation x |= α of the standard Kripke semantics.
2
Sequent Calculus and Cut-Elimination
Formulas are constructed from (countable) propositional variables, ⊥ (falsity constant), → (implication), ∧ (conjunction), ∨ (disjunction) and [b] (sequence modal operator) where b is a sequence. Sequences are constructed from the empty sequence ∅, (countable) atomic sequences and ; (concatenation). Lower-case letters b, c, ... are used for sequences, Greek lower-case letters α, β, ... are used for formulas, and Greek capital letters Γ, Δ, ... are used for finite (possibly empty) sets of formulas. The symbol ≡ is used to denote the equality of sets of symbols. The symbol ω is used to represent the set of natural numbers. An expression b^i with i ∈ ω is defined inductively by b^0 ≡ ∅ and b^{k+1} ≡ b ; b^k. An expression [∅]α means α, and expressions [∅ ; b]α and [b ; ∅]α mean [b]α. An expression of the form Γ ⇒ γ where γ can be the empty set is called a sequent. The terminological conventions regarding sequent calculus (e.g., antecedent, succedent etc.) are used. If a sequent S is provable in a sequent calculus L, then such a fact is denoted as L ⊢ S or ⊢ S. A rule R of inference is said to be admissible in a sequent calculus L if the following condition is satisfied: for any instance S1 · · · Sn / S of R, if L ⊢ Si for all i, then L ⊢ S.

Definition 1. Formulas α and sequences b are defined by the following grammar, assuming p and e represent propositional variables and atomic sequences, respectively:
α ::= p | ⊥ | α→α | α ∧ α | α ∨ α | [b]α
b ::= ∅ | e | (b ; b)

The symbol SE is used to represent the (countable) set of sequences including the empty sequence ∅. An expression [d̂] is used to represent [d0][d1] · · · [di] with i ∈ ω and d0 ≡ ∅, i.e., [d̂] can be the empty sequence. An expression d̂ is used to represent d0 ; d1 ; · · · ; di with i ∈ ω and d0 ≡ ∅, i.e., d̂ can be ∅. Expressions ∅ ; b and b ; ∅ mean b.

Definition 2 (SLJ). The initial sequents of SLJ are of the form: for any propositional variable p,
[d̂]p ⇒ [d̂]p        [d̂]⊥ ⇒ .
The structural inference rules of SLJ are of the form:
(cut): from Γ ⇒ α and α, Σ ⇒ γ, infer Γ, Σ ⇒ γ.
(we-left): from Γ ⇒ γ, infer α, Γ ⇒ γ.
(we-right): from Γ ⇒ , infer Γ ⇒ α.
The logical inference rules of SLJ are of the form:
(→left): from Γ ⇒ [d̂]α and [d̂]β, Σ ⇒ γ, infer [d̂](α→β), Γ, Σ ⇒ γ.
(→right): from [d̂]α, Γ ⇒ [d̂]β, infer Γ ⇒ [d̂](α→β).
(∧left1): from [d̂]α, Γ ⇒ γ, infer [d̂](α ∧ β), Γ ⇒ γ.
(∧left2): from [d̂]β, Γ ⇒ γ, infer [d̂](α ∧ β), Γ ⇒ γ.
(∧right): from Γ ⇒ [d̂]α and Γ ⇒ [d̂]β, infer Γ ⇒ [d̂](α ∧ β).
(∨left): from [d̂]α, Γ ⇒ γ and [d̂]β, Γ ⇒ γ, infer [d̂](α ∨ β), Γ ⇒ γ.
(∨right1): from Γ ⇒ [d̂]α, infer Γ ⇒ [d̂](α ∨ β).
(∨right2): from Γ ⇒ [d̂]β, infer Γ ⇒ [d̂](α ∨ β).
The sequence inference rules of SLJ are of the form:
(;left): from [d̂][b][c]α, Γ ⇒ γ, infer [d̂][b ; c]α, Γ ⇒ γ.
(;right): from Γ ⇒ [d̂][b][c]α, infer Γ ⇒ [d̂][b ; c]α.
Remark that we have the fact: for any formula α and any sequence [d̂], SLJ − (cut) ⊢ [d̂]α ⇒ [d̂]α. This fact can be proved by induction on α. We can also obtain the following fact as a special case of this fact: for any formula α, SLJ − (cut) ⊢ α ⇒ α. In order to show an embedding theorem, Gentzen's LJ for intuitionistic logic is introduced below.

Definition 3 (LJ). LJ is obtained from SLJ by deleting the sequence inference rules and [d̂] appearing in the initial sequents and the logical inference rules. The names of the logical inference rules of LJ are denoted by labeling “LJ” in superscript position, e.g., (→left^LJ).

Definition 4. We fix a countable set Ψ of propositional variables, and define the sets Ψ_d̂ := {p_d̂ | p ∈ Ψ} (d̂ ∈ SE) of propositional variables where p_∅ := p, i.e., Ψ_∅ := Ψ. The language (or the set of formulas) L_SLJ of SLJ is constructed from Ψ, ⊥, →, ∧, ∨ and [b]. The language (or the set of formulas) L_LJ of LJ is constructed from ⋃_{d̂∈SE} Ψ_d̂, ⊥, →, ∧ and ∨.
A mapping f from L_SLJ to L_LJ is defined by:
1. f([d̂]p) := p_d̂ ∈ Ψ_d̂ for any p ∈ Ψ,
2. f([d̂]⊥) := ⊥,
3. f([d̂](α ∘ β)) := f([d̂]α) ∘ f([d̂]β) where ∘ ∈ {→, ∧, ∨},
4. f([d̂][b ; c]α) := f([d̂][b][c]α).
Let Γ be a set of formulas in LSLJ . Then, an expression f (Γ ) means the result of replacing every occurrence of a formula α in Γ by an occurrence of f (α).
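To make the embedding concrete, the following Python sketch implements the mapping f of Definition 4 on a simple tuple encoding of formulas. The encoding itself (constructor tags, sequences as tuples of atomic symbols) is an assumption made for illustration; it is not taken from the paper.

```python
# Sketch (not the authors' code) of the embedding f of Definition 4. Formulas are
# encoded as nested tuples: ("var", p), ("bot",), ("->", a, b), ("and", a, b),
# ("or", a, b), ("box", seq, a), where seq is a tuple of atomic sequence symbols
# (a concatenation b ; c is simply tuple concatenation).
def f(formula, d=()):
    """Translate an SLJ formula under the accumulated sequence index d-hat."""
    kind = formula[0]
    if kind == "var":                       # f([d]p) := p indexed by d
        return ("var", formula[1], d)
    if kind == "bot":                       # f([d]bot) := bot
        return ("bot",)
    if kind in ("->", "and", "or"):         # f distributes over ->, and, or
        return (kind, f(formula[1], d), f(formula[2], d))
    if kind == "box":                       # f([d][b ; c]a) = f([d][b][c]a): flatten the sequence
        return f(formula[2], d + tuple(formula[1]))
    raise ValueError(f"unknown formula constructor: {kind}")

# Example: f([e](p -> [c]q)) yields an LJ formula over indexed variables p_e and q_(e,c).
print(f(("box", ("e",), ("->", ("var", "p"), ("box", ("c",), ("var", "q"))))))
```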
Theorem 5 (Embedding). Let Γ be a set of formulas in LSLJ , γ be a formula or the empty set, and f be the mapping defined in Definition 4. Then: 1. SLJ Γ ⇒ γ iff LJ f (Γ ) ⇒ f (γ). 2. SLJ − (cut) Γ ⇒ γ iff LJ − (cut) f (Γ ) ⇒ f (γ). Proof. We show only (1) since (2) is included a subcase of (1). • (=⇒): By induction on the proofs P of Γ ⇒ γ in SLJ. We distinguish the cases according to the last inference of P . We show some cases. ˆ ⇒ [d]p): ˆ ˆ ⇒ [d]p. ˆ Since Case ([d]p The last inference of P is of the form: [d]p ˆ f ([d]p) coincides with pdˆ by the definition of f , we obtain the required fact LJ ˆ ˆ f ([d]p) ⇒ f ([d]p). Case (∨right1): The last inference of P is of the form: ˆ Γ ⇒ [d]α (∨right1). ˆ Γ ⇒ [d](α ∨ β) ˆ By induction hypothesis, we have LJ f (Γ ) ⇒ f ([d]α). Then, we obtain the required fact: .. .. ˆ f (Γ ) ⇒ f ([d]α) (∨right1LJ ) ˆ ˆ f (Γ ) ⇒ f ([d]α) ∨ f ([d]β) ˆ ˆ ˆ where f ([d]α) ∨ f ([d]β) coincides with f ([d](α ∨ β)) by the definition of f . Case (;left): The last inference of P is of the form: ˆ [d][b][c]α, Γ ⇒γ (;left). ˆ [d][b ; c]α, Γ ⇒ γ ˆ By induction hypothesis, we have LJ f ([d][b][c]α), f (Γ ) ⇒ f (γ). Then, we obˆ ˆ ; c]α) by the defitain the required fact since f ([d][b][c]α) coincides with f ([d][b nition of f . • (⇐=): By induction on the proofs Q of f (Γ ) ⇒ f (γ) in LJ. We distinguish the cases according to the last inference of Q. We show only the following case. Case (→leftLJ ): The last inference of Q is of the form: ˆ ˆ f (Γ ) ⇒ f ([d]α) f ([d]β), f (Σ) ⇒ f (γ) (→leftLJ ) ˆ f ([d](α→β)), f (Γ ), f (Σ) ⇒ f (γ) ˆ ˆ ˆ where f ([d](α→β)) coincides with f ([d]α)→f ([d]β) by the definition of f . By ˆ ˆ Σ ⇒ γ. Then, induction hypothesis, we have SLJ Γ ⇒ [d]α and SLJ [d]β, we obtain the required fact:
.. .. .. .. ˆ ˆ Σ⇒γ Γ ⇒ [d]α [d]β, (→left). ˆ [d](α→β), Γ,Σ ⇒ γ Theorem 6 (Cut-elimination). The rule (cut) is admissible in cut-free SLJ. Proof. Suppose that SLJ Γ ⇒ γ. Then, we have LJ f (Γ ) ⇒ f (γ) by Theorem 5 (1). We obtain LJ − (cut) f (Γ ) ⇒ f (γ) by the well-known cutelimination theorem for LJ. By Theorem 5 (2), we obtain the required fact SLJ − (cut) Γ ⇒ γ. Theorem 7 (Decidability). SLJ is decidable. Proof. The provability of SLJ can finitely be transformed into that of LJ by Theorem 5. Since LJ is decidable, SLJ is also decidable.
3
Kripke Semantics and Completeness
Definition 8. A Kripke frame is a structure ⟨M, SE, R⟩ satisfying the following conditions:
1. M is a nonempty set,
2. R is a reflexive and transitive binary relation on M.

Definition 9. A valuation |= on a Kripke frame ⟨M, SE, R⟩ is a mapping from the set Ψ of all propositional variables to the power set 2^(M×SE) of the direct product M × SE such that for any p ∈ Ψ, any d̂ ∈ SE, and any x, y ∈ M, if (x, d̂) ∈ |=(p) and xRy, then (y, d̂) ∈ |=(p). We will write (x, d̂) |= p for (x, d̂) ∈ |=(p). Each valuation |= is extended to a mapping from the set of all formulas to 2^(M×SE) by the following clauses:
1. (x, d̂) |= ⊥ does not hold,
2. (x, d̂) |= α→β iff ∀y ∈ M [xRy and (y, d̂) |= α imply (y, d̂) |= β],
3. (x, d̂) |= α ∧ β iff (x, d̂) |= α and (x, d̂) |= β,
4. (x, d̂) |= α ∨ β iff (x, d̂) |= α or (x, d̂) |= β,
5. for any atomic sequence e, (x, d̂) |= [e]α iff (x, d̂ ; e) |= α,
6. (x, d̂) |= [b ; c]α iff (x, d̂) |= [b][c]α.
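The clauses of Definition 9 can be read directly as a recursive evaluator over a finite Kripke model, as in the sketch below. The tuple encoding of formulas (the same one used in the earlier embedding sketch) and the concrete representation of R and the valuation are assumptions for illustration only.

```python
def sat(world, d, formula, worlds, R, V):
    """Satisfaction relation (x, d-hat) |= formula over a finite model (sketch)."""
    kind = formula[0]
    if kind == "var":
        return (world, d) in V.get(formula[1], set())
    if kind == "bot":
        return False                                          # clause 1
    if kind == "->":                                          # clause 2: quantify over R-successors
        return all((world, y) not in R
                   or not sat(y, d, formula[1], worlds, R, V)
                   or sat(y, d, formula[2], worlds, R, V)
                   for y in worlds)
    if kind == "and":
        return sat(world, d, formula[1], worlds, R, V) and sat(world, d, formula[2], worlds, R, V)
    if kind == "or":
        return sat(world, d, formula[1], worlds, R, V) or sat(world, d, formula[2], worlds, R, V)
    if kind == "box":                                         # clauses 5-6: extend the sequence index
        return sat(world, d + tuple(formula[1]), formula[2], worlds, R, V)
    raise ValueError(f"unknown formula constructor: {kind}")

# Tiny example: two worlds with x R y; p holds only at (y, empty sequence).
worlds = {"x", "y"}
R = {("x", "x"), ("y", "y"), ("x", "y")}
V = {"p": {("y", ())}}
print(sat("x", (), ("->", ("var", "p"), ("var", "p")), worlds, R, V))  # True
```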
Proposition 10. The following clauses hold: for any c, dˆ ∈ SE and any formula α, ˆ |= [c]α iff (x, dˆ ; c) |= α, 1. (x, d) ˆ iff (x, d) ˆ |= α. 2. (x, ∅) |= [d]α Proof. (1) is proved by induction on c, and (2) is derived from (1). Proposition 10 will be used implicitly in the following discussion. ˆ |= α can be read as “α is true by the execution of the The statement (x, d) ˆ sequence d at the state x.”
Proposition 11. Let |= be a valuation on a Kripke frame M, SE, R. For any ˆ |= α and xRy, then (y, d) ˆ |= α. formula α, any dˆ ∈ SE, and any x, y ∈ M , if (x, d) Proof. By induction on the complexity of α. We show only the following case. ˆ |= [c]β, i.e., (x, dˆ ; c) |= β. By inducCase α ≡ [c]β. Suppose xRy and (x, d) ˆ |= [c]β. tion hypothesis, we obtain (y, dˆ ; c) |= β, i.e., (y, d) In the following discussion, Proposition 11 will often be used implicitly. An expression Γ ∧ means γ 1 ∧ γ 2 ∧ · · · ∧ γ n if Γ ≡ {γ 1 , γ 2 , ..., γ n } (0 ≤ n). Let γ be the empty sequence or a set consisting of a single formula. An expression γ ∗ means α or ⊥ if γ ≡ {α} or ∅, respectively. An expression (Γ ⇒ γ)∗ means Γ ∧ →γ ∗ if Γ is not empty, and means γ ∗ otherwise. Definition 12. A Kripke model is a structure M, SE, R, |= such that 1. M, SE, R is a Kripke frame, and 2. |= is a valuation on M, SE, R. A formula α is true in a Kripke model M, SE, R, |= if (x, ∅) |= α for any x ∈ M , and valid in a Kripke frame M, SE, R if it is true for any valuation |= on the Kripke frame. A sequent Γ ⇒ γ is true in a Kripke model M, SE, R, |= if the formula (Γ ⇒ γ)∗ is true in the Kripke model, and valid in a Kripke frame M, SE, R if it is true for any valuation |= on the Kripke frame. Theorem 13 (Soundness). Let C be the class of all Kripke frames, L := {Γ ⇒ γ | SLJ Γ ⇒ γ} and L(C ) := {Γ ⇒ γ | Γ ⇒ γ is valid in all frames of C }. Then L ⊆ L(C ). Proof. It is sufficient to show that for any sequent Γ ⇒ γ, if SLJ Γ ⇒ γ, then Γ ⇒ γ is valid in M, SE, R ∈ C, i.e., for any valuation |= on M, SE, R, and any x ∈ M , (x, ∅) |= (Γ ⇒ γ)∗ . This can straightforwardly be proved by induction on the proofs P of Γ ⇒ γ in SLJ. We now start to prove the completeness theorem for SLJ. Prior to the detailed presentation of the completeness proof, we show a Lindenbaum Lemma. Definition 14. Let x and y be sets of formulas. The pair (x, y) is consistent if for any α1 , ..., αm ∈ x and any β 1 , ..., β n ∈ y with (m, n ≥ 0), the sequent α1 , ..., αm ⇒ β 1 ∨ · · · ∨ β n is not provable in SLJ. The pair (x, y) is maximal consistent if it is consistent and for every formula α, α ∈ x or α ∈ y. The following lemma can be proved using (cut). Lemma 15. Let x and y be sets of formulas. If the pair (x, y) is consistent, then there is a maximal consistent pair (x , y ) such that x ⊆ x and y ⊆ y . Proof. Let γ 1 , γ 2 , ... be an enumeration of all formulas of SLJ. Define a sequence of pairs (xn , yn ) (n = 0, 1, ...) inductively by (x0 , y0 ) := (x, y), and
(xm+1 , ym+1 ) := (xm , ym ∪ {γ m+1 }) if (xm , ym ∪ {γ m+1 }) is consistent, and (xm+1 , ym+1 ) := (xm ∪ {γ m+1 }, ym ) otherwise. We can obtain the fact that if (xm , ym ) is consistent, then so is (xm+1 , ym+1 ). To verify it, suppose (xm+1 , ym+1 ) is not consistent. Then there are formulas α1 , ..., αi , α1 , ..., αj ∈ xm and β 1 , ..., β k , β 1 , ..., β l ∈ ym such that SLJ α1 , ..., αi ⇒ β 1 ∨ · · · ∨ β k ∨ γ m+1 and SLJ α1 , ..., αj , γ m+1 ⇒ β 1 ∨ · · · ∨ β l . By using (cut) and some other rules, we can obtain SLJ α1 , ..., αi , α1 , ..., αj ⇒ β 1 ∨ · · · ∨ β k ∨ β 1 ∨ · · · ∨ β l . This contradicts the consistency of (xm , ym ). Hence, a pair (xk , yk ) produced by the construction ∞is consistent for any k. We thus obtain a maximal consistent pair ∞ ( n=0 xn , n=0 yn ). Definition 16. Let ML be the set of all maximal consistent pairs. A binary relation RL on ML is defined by (x, w)RL (y, z) iff x ⊆ y. A valuation |=L (p) ˆ ∈ x}. ˆ ∈ ML × SE | [d]p for any propositional variable p is defined by {((x, w), d) Lemma 17. The structure ML , SE, RL , |=L is a Kripke model such that for ˆ ∈ x iff ((x, w), d) ˆ |=L α. any formula α, any dˆ ∈ SE, and any (x, w) ∈ ML , [d]α Proof. It can be shown that (1) ML is a nonempty set, because (u, v) ∈ ML by the discussion above, (2) RL is a reflexive and transitive relation on ML , and (3) for any propositional variable p and any (x, w), (y, z) ∈ ML , if (x, w)RL (y, z) and ˆ |=L (p), then ((y, z), d) ˆ |=L (p). Thus, the structure ML , SE, RL , |=L ((x, w), d) is a Kripke model. It remains to be shown that in this model, for any formula α, any dˆ ∈ SE, ˆ ∈ x iff ((x, w), d) ˆ |=L α. This is shown by induction and any (x, w) ∈ ML , [d]α on the complexity of α. Base step. By Definition 16. Induction step. ˆ ∈ x does not hold. • Case α ≡ ⊥. By the consistency of (x, w), [d]⊥ ˆ ˆ |=L γ→δ, • Case α ≡ γ→δ. Suppose [d](γ→δ) ∈ x. We will show ((x, w), d) ˆ ˆ |=L δ]. i.e., ∀(y, z) ∈ ML [(x, w)RL (y, z) and ((y, z), d) |=L γ imply ((y, z), d) ˆ ˆ |=L γ. Then we have (*): [d](γ→δ) ∈ y by Suppose (x, w)RL (y, z) and ((y, z), d) ˆ the definition of RL , and obtain (**): [d]γ ∈ y by the induction hypothesis. Since ˆ ˆ ⇒ [d]δ, ˆ the fact that [d]δ ˆ ∈ z contradicts the (*), (**) and SLJ [d](γ→δ), [d]γ ˆ consistency of (y, z), and hence [d]δ ∈ / z. By the maximality of (y, z), we obtain ˆ ∈ y. By the induction hypothesis, we obtain the required fact ((y, z), d) ˆ |=L [d]δ ˆ ˆ δ. Conversely, suppose [d](γ→δ) ∈ / x. Then [d](γ→δ) ∈ w by the maximality of ˆ ˆ (x, w). Then the pair (x∪{[d]γ}, {[d]δ}) is consistent for the following reason. If it ˆ ⇒ [d]δ ˆ for some Γ consisting of formulas in x, is not consistent, then SLJ Γ , [d]γ ˆ and hence SLJ Γ ⇒ [d](γ→δ). This fact contradicts the consistency of (x, w). ˆ By Lemma 15, there is a maximal consistent pair (y, z) such that x ∪ {[d]γ} ⊆y ˆ ˆ ∈ and {[d]δ} ⊆ z (thus, we have [d]δ / y by the consistency of (y, z)). In the ˆ |=L γ and not-[((y, z), d) ˆ |=L δ] by the sequel, we have (x, w)RL (y, z), ((y, z), d) ˆ induction hypothesis. Therefore ((x, w), d) |=L γ→δ does not hold.
ˆ ∧δ) ∈ x. Since SLJ [d](γ ˆ ∧ δ) ⇒ [d]γ, ˆ the fact • Case α ≡ γ ∧δ. Suppose [d](γ ˆ ˆ that [d]γ ∈ w contradicts the consistency of (x, w), and hence [d]γ ∈ x by the ˆ ∈ x. By the induction hypothesis, maximality of (x, w). Similarly, we obtain [d]δ ˆ ˆ |=L γ ∧ δ. ˆ we obtain ((x, w), d) |=L γ and ((x, w), d) |=L δ, and hence ((x, w), d) ˆ ˆ |=L ˆ Conversely, suppose ((x, w), d) |=L γ ∧ δ, i.e., ((x, w), d) |=L γ and ((x, w), d) ˆ ∈ x and [d]δ ˆ ∈ x by the induction hypothesis. Since SLJ δ. Then we obtain [d]γ ˆ ˆ ˆ ˆ ∧ δ) ∈ w contradicts the consistency [d]γ, [d]δ ⇒ [d](γ ∧ δ), the fact that [d](γ ˆ of (x, w), and hence [d](γ ∧ δ) ∈ / w. By the maximality of (x, w), we obtain ˆ [d](γ ∧ δ) ∈ x. ˆ ∨ δ) ∈ x. Since SLJ [d](γ ˆ ∨ δ) ⇒ [d]γ ˆ ∨ [d]δ, ˆ • Case α ≡ γ ∨ δ. Suppose [d](γ ˆ [d]δ ˆ ∈ w contradicts the consistency of (x, w), and hence [d]γ ˆ ∈ the fact that [d]γ, / ˆ ˆ ˆ w or [d]δ ∈ / w. Thus, we obtain [d]γ ∈ x or [d]δ ∈ x by the maximality of (x, w). ˆ |=L γ or ((x, w), d) ˆ |=L δ, By the induction hypothesis, we obtain ((x, w), d) ˆ |=L γ ∨ δ, i.e., ˆ |=L γ ∨ δ. Conversely, suppose ((x, w), d) and hence ((x, w), d) ˆ ˆ ((x, w), d) |=L γ or ((x, w), d) |=L δ. By the induction hypothesis, we obtain ˆ ∈ x or [d]δ ˆ ∈ x. Since SLJ [d]γ ˆ ⇒ [d](γ ˆ ∨ δ) and SLJ [d]δ ˆ ⇒ [d](γ ˆ ∨ δ), [d]γ ˆ the fact that [d](γ ∨ δ) ∈ w contradicts the consistency of (x, w), and hence ˆ ˆ ∨ δ) ∈ x. [d](γ ∨ δ) ∈ / w. By the maximality of (x, w), we obtain [d](γ ˆ • Case α ≡ [c]β. [d][c]β ∈ x iff ((x, w), dˆ ; c) |=L β (by the induction hypothˆ |=L [c]β. esis) iff ((x, w), d) Theorem 18 (Completeness). Let C be the class of all Kripke frames, L := {Γ ⇒ γ | SLJ Γ ⇒ γ} and L(C ) := {Γ ⇒ γ | Γ ⇒ γ is valid in all frames of C }. Then L = L(C ). Proof. In order to prove this theorem, by Theorem 13, it is sufficient to show that L(C ) ⊆ L, i.e., for any sequent Γ ⇒ γ, Γ ⇒ γ is valid in an arbitrary frame in C , then it is provable in SLJ. To show this, we show that if Γ ⇒ γ is not provable in SLJ, then there is a frame F = ML , SE, RL ∈ C such that Γ ⇒ γ is not valid in F , i.e., there is a Kripke model ML , SE, RL , |=L such that Γ ⇒ γ is not true in it. Suppose that Γ ⇒ γ is not provable in SLJ. Then the pair (Γ , γ) is consistent. By Lemma 15, there is a maximal consistent pair (u, v) such that Γ ⊆ u and γ ⊆ v. Note that if γ ≡ {α}, then α ∈ / u by the consistency of (u, v). Our goal is to show that ((u, v), ∅) |=L (Γ ⇒ γ)∗ does not hold in the constructed model. Here we consider only the case Γ = ∅. We show that ((u, v), ∅) |=L Γ ∧ →γ ∗ does not hold, i.e., ∃(x, z) ∈ ML [[(u, v)RL (x, z) and ((x, z), ∅) |=L Γ ∧ ] ˆ we can and [((x, z), ∅) |=L γ ∗ does not hold ]]. Taking (u, v) for (x, z) and ∅ for d, verify that there is (u, v) ∈ ML such that [(u, v)RL (u, v) and ((u, v), ∅) |=L Γ ∧ ] and [((u, v), ∅) |=L γ ∗ does not hold]. The first argument is obvious because of the reflexivity of RL and the fact that Γ ⊆ u. The second argument is shown below. The case γ ≡ ∅ is obvious because ((u, v), ∅) |=L ⊥ does not hold. The case γ ≡ α can be proved by using Lemma 17 and the fact that α ∈ / u, because we have the fact that α ∈ / u iff [((u, v), ∅) |=L α does not hold] by Lemma 17.
4
Concluding Remarks
In this paper, the logic SLJ was obtained from Gentzen’s sequent calculus LJ for intuitionistic logic by adding the sequence modal operator. SLJ can suitably represent incomplete sequential information. The cut-elimination, decidability and completeness theorems for SLJ were proved as the main results of this paper. It was thus shown in this paper that SLJ gives an appropriate proof-theoretical basis for representing incomplete sequential information. Finally, we mention that SLJ can be used to develop a logic programming framework. Many researches have reported on the applications of LJ and its neighbors to programming languages. By the embedding theorem of SLJ into LJ, we can translate the set of formulas of SLJ into that of LJ. Hence, SLJ can also be adopted to the previously established frameworks of goal-directed logic programming [4]. Acknowledgments. This work was partially supported by the Japanese Ministry of Education, Culture, Sports, Science and Technology, Grant-in-Aid for Young Scientists (B) 20700015.
References 1. van Dalen, D.: Logic and structure, 3rd edn. Springer, Heidelberg (1994) 2. Kamide, N., Kaneiwa, K.: Extended full computation-tree logic with sequence modal operator: Representing hierarchical tree structures. In: Nicholson, A., Li, X. (eds.) AI 2009. LNCS (LNAI), vol. 5866, pp. 485–494. Springer, Heidelberg (2009) 3. Kamide, N., Wansing, H.: Combining linear-time temporal logic with constructiveness and paraconsistency. Journal of Applied Logic 8, 33–61 (2010) 4. Miller, D., Nadathur, G., Pfenning, F., Scedrov, A.: Uniform proofs as a foundation for logic programming. Annals of Pure and Applied Logic 51, 125–157 (1991) 5. Odintsov, S., Wansing, H.: Inconsistency-tolerant description logic. Motivation and basic systems. In: Trends in Logic. 50 Years of Studia Logica, pp. 301–335. Kluwer Academic Publishers, Dordrecht (2004) 6. Wansing, H.: The Logic of Information Structures. LNCS (LNAI), vol. 681, 163 p. Springer, Heidelberg (1993)
A Power-Enhanced Algorithm for Spatial Anomaly Detection in Binary Labelled Point Data Using the Spatial Scan Statistic Simon Read1 , Peter Bath1 , Peter Willett1 , and Ravi Maheswaran2 1
Department of Information Studies, University of Sheffield, Sheffield S1 4DP, UK [email protected] 2 ScHARR, University of Sheffield, Sheffield S1 4DA, UK
Abstract. This paper presents a novel modification to an existing algorithm for spatial anomaly detection in binary labeled point data sets, using the Bernoulli version of the Spatial Scan Statistic. We identify a potential ambiguity in p-values produced by Monte Carlo testing, which (by the selection of the most conservative p-value) can lead to sub-optimal power. When such ambiguity occurs, the modification uses a very inexpensive secondary test to suggest a less conservative p-value. Using benchmark tests, we show that this appears to restore power to the expected level, whilst having similar retest variance to the original. The modification also appears to produce a small but significant improvement in overall detection performance when multiple anomalies are present.
1
Introduction
The detection of spatial anomalies (a.k.a. ‘spatial cluster detection’) in binary labeled point data has important applications in analyzing the geographic distribution of health events (e.g. [1]) and other fields such as forestry (e.g. [2]). Since 1997, the freely available SaTScanTM software package (www.satscan.org) has provided a means of detecting such anomalies using the Spatial Scan Statistic [3] and has been used in well over a hundred published scholarly studies [4]. Section 2 gives more details of the research context. Combined with a moving scan window procedure, the statistic identifies the location, size and statistical significance (p-value) of potential anomalies within a given study region. Due to the nature of the Spatial Scan Statistic as it is currently applied to binary labeled point data (see primer in Section 3), it is sometimes not possible to establish an exact p-value for the most likely potential anomaly. This ambiguity can occur in various places on the unit interval, including the most useful part of the range, 0 ≤ p-value ≤ 0.1. This issue is explained in Section 4. When ambiguity occurs, SaTScanTM selects the most conservative p-value, resulting in a sometimes lower-than-expected false positive rate (FPR), and a corresponding reduction in true positive rate (TPR, a.k.a. statistical power or sensitivity). R. Setchi et al. (Eds.): KES 2010, Part II, LNAI 6277, pp. 163–172, 2010. c Springer-Verlag Berlin Heidelberg 2010
The principal contribution of this paper is to describe a means by which, when p-value ambiguity occurs, otherwise redundant information can be used to meaningfully (and consistently) suggest a less conservative p-value. This is described in Section 5. Using benchmark tests (Section 6) we show this produces a false positive rate very close to the nominal significance level, and correspondingly increases the power in the circumstances outlined above. The proposed algorithm also delivers a p-value consistency (i.e. mean retest variance) comparable to SaTScanTM The secondary contribution is that, when applied to data sets where several anomalies are present, the modification appears to produce a small improvement in the ratio of true and false positive rates, as measured using area under curve (AUC) as applied to ROC curves. We use a ‘within datasets’ Monte Carlo method to show this improvement to be statistically significant, as described in Section 6. A discussion of the results, and future research directions, is given in Section 7.
2
Research Context
Spatial anomaly detection has three broad categories: global (identifying if any anomaly is present in the study region, but not specifying a location); localised (as global, but specifying location) and focused (testing for the presence of an anomaly at a location specified a priori). The Spatial Scan Statistic [3] is a widely used method of localised anomaly detection. Localised detection is the most flexible, as it can perform the function of the other two, albeit possibly with sub-optimum power. Following on from [3], many frequentist versions of the statistic have been developed and compared (see the list in [4]), as well as a Bayesian version [5]. Mostly these are for use with areal data, e.g. disease counts in postal districts. The Bernoulli version (hereafter SSSB) is for use with binary labeled point data. Despite being introduced in [3] and used in various studies since (e.g. [1], [2]), there has been little research into the benchmark performance of the SSSB. Recently [6] used the SSSB when considering an alternative circular scan window selection method (scan windows, termed here Zj, are defined in Section 3), and [7] have developed a risk-adjusted SSSB variant. Although the latter is of some interest, it appears to have lower-than-expected FPR and TPR, so in this paper we only consider the original SSSB. It is also worth noting that many studies have been published on the effect of using differently shaped scan windows, of which a useful summary is given in [8]. The results of this study should be applicable to all types of scan window, provided they can be applied to point data.
3
Primer: Spatial Scan Statistic (Bernoulli Version)
The Spatial Scan Statistic has several versions. The Bernoulli version (hereafter SSSB) is suited to binary labeled point data. For the benefit of readers unfamiliar
with the Spatial Scan Statistic, this section formally defines1 the SSSB , its accompanying data structures and method of application2 . For a derivation of the statistic see [3]. Consider a spatial region R , with r any point location therein. Consider a data set P = {p1 ,p2 ...pN } where each pi is associated a single point location loci ∈ R, and a binary label si . Let P0 = {pi : si = 0} and P1 = {pi : si = 1}, such that P0 ∪ P1 = P and P0 ∩ P1 = . Let N and the position of each loci be taken as given, but assume each si value to be the outcome of an independent Bernoulli trial with probability p(si = 1) = ρ(r), where ρ(r) is some arbitrary value (on the unit interval) associated with point r. Let H0 represent the (null) hypothesis that ρ(r) is constant for all r ∈ R. That is, the distribution of the elements of P1 amongst P is uniformly random. Let HA represent the (alternate) hypothesis that a spatial anomaly is present, i.e. there is a subset of R where ρ(r) is higher (or lower) than the rest of R. Put formally, HA ⇔ (∃A ⊂ R, hence = 1. In ∃B = R − A) such that ρ(a ∈ A) = βρ(b ∈ B), where β is a constant3
this study we only consider β > 1, but the results will apply equally to β < 1. This Bernoulli model is useful for representing point occurrences in many real-world applications, as it controls for an inhomogeneous underlying distribution of events. In real data sets, A (if it exists) can only be estimated by guessing which loci lie inside or outside it. Let us call any particular estimate Z. Furthermore, let us assume we have some predefined scheme (typically a moving scan window of variable size) for generating a set of estimates, Z = {Z1, Z2, Z3, . . .}, where each Zj ⊂ P. The purpose of the SSSB is to determine which Zj (let us call this Zprime) is most likely to represent A, if indeed A exists. We then associate a p-value with Zprime, which represents the probability that H0 is true (in which case Zprime is a random artefact). To use the SSSB, we split all Zj such that Zj0 = {pi : pi ∈ Zj and si = 0} and Zj1 = {pi : pi ∈ Zj and si = 1}. For each Zj the SSSB takes four integer inputs (N = |P|, C = |P1|, n = |Zj| and c = |Zj1|) and produces one quasi-continuous output, the log likelihood ratio or LLR. The formula is given in three parts: Equation 1 gives the likelihood of HA if Zj represents A; Equation 2 gives the likelihood of H0, identical for all choices of Zj; Equation 3 brings the two values together to produce the LLR. Testing all Zj ∈ Z using the SSSB gives a multiset L of LLR values, where L = {llr1, llr2, llr3, . . .}. Zprime is then the Zj for which llrj ≥ llrk ∀k ≠ j
(let us call this llrprime). In the case of multiple maximum LLR values, an arbitrary choice for Zprime is made.
Footnotes: (1) Regarding notation: italic lower-case = scalar; italic upper-case = set (or multiset); bold upper-case = set (or multiset) of sets (or multisets). (2) SSSB can also be applied to spatio-temporal data, not discussed here. (3) Note an assumption of uniform probability inside and outside the anomaly is required by the Spatial Scan Statistic. (4) By represent, we mean pi ∈ Zprime ⇒ loci ∈ A and pi ∉ Zprime ⇒ loci ∉ A. (5) Note I represents the indicator function. (6) Note any log base can be used, provided it is consistent throughout the study.
L_A(Z_j) = \left(\tfrac{c}{n}\right)^{c} \left(1-\tfrac{c}{n}\right)^{(n-c)} \left(\tfrac{C-c}{N-n}\right)^{(C-c)} \left(1-\tfrac{C-c}{N-n}\right)^{(N-n-C+c)} I\!\left(\tfrac{c}{n} > \tfrac{C-c}{N-n}\right)   (1)

L_0 = \left(\tfrac{C}{N}\right)^{C} \left(\tfrac{N-C}{N}\right)^{(N-C)}   (2)

llr_j = \log \frac{L_A(Z_j)}{L_0}   (3)

The p-value of Zprime is obtained by randomisation testing. For step m (of M Monte Carlo steps) the si values of all points are pooled and randomly re-allocated. The above procedure is repeated, generating a new multiset Lm of LLR values. For each Lm the maximum LLR (llrprime−m) is recorded and stored in multiset D where D = {llrprime−1, llrprime−2, . . . llrprime−M}. If H0 is true, the 'real' value llrprime should fall comfortably within the distribution of llrprime−m values. To calculate the p-value of Zprime using the established SaTScanTM procedure, we count the number of llrprime−m ≥ llrprime (let us call this v) and set the p-value to (v + 1)/(M + 1). This Monte Carlo procedure is compatible with most versions of the Spatial Scan Statistic, but it sometimes creates a particular problem when used with the SSSB, discussed in Section 4.
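To make the procedure concrete, the following Python sketch (ours, not the authors' implementation; natural logarithms are used, and the function and variable names are illustrative) computes the Bernoulli LLR of Equations 1-3 for one scan window and the conservative SaTScan-style p-value from the replicate maxima.

import math

def bernoulli_llr(N, C, n, c):
    # LLR of Equations (1)-(3) for a window of n points, c of them cases,
    # in a data set of N points containing C cases.
    if n == 0 or n == N:
        return 0.0
    p_in = c / n
    p_out = (C - c) / (N - n)
    if p_in <= p_out:              # indicator I(c/n > (C-c)/(N-n)) in Equation 1
        return 0.0
    def term(k, p):                # k*log(p), with the convention 0*log(0) = 0
        return k * math.log(p) if k > 0 else 0.0
    log_la = (term(c, p_in) + term(n - c, 1 - p_in)
              + term(C - c, p_out) + term(N - n - C + c, 1 - p_out))
    log_l0 = term(C, C / N) + term(N - C, (N - C) / N)
    return log_la - log_l0         # llr_j = log(L_A(Z_j)/L_0)

def satscan_p_value(llr_prime, replicate_maxima):
    # Conservative p-value: v counts replicates whose maximum LLR is >= llr_prime.
    v = sum(1 for m in replicate_maxima if m >= llr_prime)
    return (v + 1) / (len(replicate_maxima) + 1)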
4
Problem Identification
All versions of the Spatial Scan Statistic share a common characteristic of being the individually most powerful test for a localised anomaly [3]. This means if a particular HA is true (see Section 3), then for a given Z and a given FPR (i.e. probability of Type I error), no test can have a greater chance of correctly rejecting H0. Of course, this assumes one is in control of the FPR. In benchmark tests conducted by the author using some other versions of the Spatial Scan Statistic (not presented here), the FPR of the Spatial Scan Statistic is generally very close to the nominal significance level (hereafter α). However, for the SSSB the FPR is sometimes markedly lower than α, which correspondingly reduces the TPR (a.k.a. power). An explanation is given below. As described in Section 3, the SSSB has four integer input parameters (N, C, n, c), and one quasi-continuous output parameter, the LLR. Within any given data set, N and C are constant, leaving only two free integer parameters. Thus many scan windows (Zj ∈ Z) share duplicate LLR values, which also produces duplicate llrprime−m values in D. The problem arises when multiple values in D match the 'real' llrprime, as one then has a range of equally valid p-values to choose from. SaTScanTM defers to the most conservative p-value, by setting v to the count of all llrprime−m ≥ llrprime. This is not in any way incorrect, but it does lead unavoidably to the drop in FPR and TPR mentioned above. One can instead set v to the count of all llrprime−m > llrprime; this leads to higher TPR but also an FPR significantly higher than α, which may not be acceptable to users.
Footnotes: (7) Under H0, ρ(r) is uniform, so randomising si has little effect on llrprime. (8) Other Zj with high llrj may also be of interest, but this is not our concern here.
Of course, this is only a problem when these multiple p-values straddle α. Unfortunately, in both sets of benchmark tests presented in this paper, a range of equally likely p-values frequently occurs that includes popular α values such as 0.05. Increasing the number of Monte Carlo repetitions does not help, as the number of duplicates increases also. If we are only concerned about the veracity of outcomes when averaged over many data sets, we could simply select a uniformly random p-value somewhere between the highest and lowest p-value (inclusive) in the ambiguous range. However, such a speculative p-value is clearly unacceptable in a real-world testing situation. The user could look for an alternative source of information about the data points instead, but an internal solution would clearly be preferable. Assuming the point locations and status are all we have, the only way of obtaining additional information is to perform a different type of anomaly test, ideally one unlikely to produce duplicate values. We can then associate a secondary value with each LLR, enabling us to rank the llrprime value amongst many identical llrprime−m values. The problem then chiefly becomes one of computation time, as one must multiply the cost of the secondary test by M + 1. The following section outlines a potential, time-efficient solution.
5
Proposed Solution
As a secondary anomaly test (to help resolve p-value ambiguity) one can make use of the values in L and Lm (the sets of all LLR values obtained from both the 'real' data and each Monte Carlo step). These are a reservoir of information, most of which is discarded. Most of these LLR values are close to zero, as they correspond to Zj in which |Zj0| and |Zj1| are wholly compatible with H0. However, when an anomaly is present, many Zj (aside from Zprime) may partially overlap or wholly include the anomaly A. Thus we may have many unusually high llrj values, even if only one anomaly is present. Therefore the mean llrj value (hereafter llr) should generally be higher when an anomaly is present. Of course, we would not ordinarily use llr as a test statistic when many well established global anomaly tests exist. However, llr is very inexpensive to calculate, making it attractive as a secondary test (for reasons outlined in Section 4). Using the original procedure, we must calculate every llrj to find Zprime. So, these values must all be present within the processor at some point, and we can use cache memory (perhaps even a spare register) to hold the running total. The cost of each addition, compared to the exponential/logarithmic operations required to find the LLR, is minuscule. An algorithm to implement this, shown alongside the original algorithm, is given below. The line marked * is the step that accounts for the majority of the total computation time. The creation of D is provided here only for illustrative purposes; if the elements of D are stored in a suitably ordered way, v can be calculated directly from D.
Footnote: (9) llr is therefore a 'global' anomaly detection statistic, as defined in Section 2.
Original Algorithm

  Load P0 and P1 from file
  Generate Z and L
  llrprime = max(L)
  Note Zprime
  Create empty set D
  For m = 1 to M {
      Shuffle all si values
    * Generate Lm
      llrprime−m = max(Lm)
      Insert llrprime−m into D
  }
  D′ (⊆ D) = {llrprime−m : llrprime−m ≥ llrprime}
  v = |D′|
  p-value = (v + 1)/(M + 1)
  Report Zprime and p-value

Proposed Algorithm

  Load P0 and P1 from file
  Generate Z and L, note running total of llrj
  llrprime = max(L)
  Note Zprime
  llr = (Σ llrj)/|L|
  Create empty set D
  For m = 1 to M {
      Shuffle all si values
    * Generate Lm, note running total of llrmj
      llrprime−m = max(Lm)
      llrm = (Σ llrmj)/|L|
      Insert pairing {llrprime−m, llrm} into D
  }
  D′ (⊆ D) = {{llrprime−m, llrm} : (llrprime−m > llrprime) or (llrprime−m = llrprime and llrm ≥ llr)}
  v = |D′|
  p-value = (v + 1)/(M + 1)
  Report Zprime and p-value
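A minimal Python sketch of the proposed ranking step (our own illustration, with illustrative names): each Monte Carlo replicate contributes a pair (maximum LLR, mean LLR), and the observed pair is compared lexicographically so that ties on the maximum are broken by the mean.

def proposed_p_value(llr_prime, llr_bar, replicate_pairs):
    # replicate_pairs holds one (maximum LLR, mean LLR) pair per Monte Carlo step;
    # exact ties on the maximum occur because the LLR depends only on the integers
    # n and c, and they are broken here by the mean LLR, as in the proposed algorithm.
    v = sum(1 for max_m, mean_m in replicate_pairs
            if max_m > llr_prime or (max_m == llr_prime and mean_m >= llr_bar))
    return (v + 1) / (len(replicate_pairs) + 1)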
6
Benchmark Results
The proposed algorithm shown above was coded in C++ and compared to the original SaTScanTM software using two batches of synthetic benchmark 'case/control' data. This section briefly describes the technical implementation, and presents the results. So that a direct comparison could be made, the generation of Z (and L) was performed using the same concentric circular method used by SaTScanTM. This involves generating a set of concentric circles centred on each loci, selecting only those circles with radii just sufficient to include loci′ (i′ ≠ i, and pi′ ∈ P1). For each circle, a scan window (Zj) is created containing all members of P whose location loci lies within this circle. A graphic example is given in [6]. Two batches of synthetic data sets (BCSR and BTRENT) were generated using a separate program. Both contain 6000 data sets: 3000 representing H0; 3000 representing a selected HA. Each data set contains the loci (two integer co-ordinates on a 500×500 grid) of 300 points: 200 'controls' (si = 0) and 100 'cases' (si = 1). Regarding the control loci: for BCSR these were generated under complete spatial randomness; for BTRENT they were generated using a Poisson process, with a p.d.f. in proportion to the 2001 population density of the Trent region of the UK, mapped onto the same 500×500 grid (full details given in [6]). This offers a comparison between homogeneous and inhomogeneous background point density. Regarding the case loci: under H0 these follow the same distribution as the control loci. For BCSR under HA, we chose to insert into each data set three
randomly located, isotropic, Gaussian shaped anomalies, each with a maximum relative risk (hereafter MRR) of 15. For a fully illustrated description, see [6]. To give a comparison with a tougher test, for BTRENT (under HA) we inserted only one such randomly placed Gaussian anomaly, also with MRR of 15. To assess performance, we obtained the p-value of both Zprime values (one generated by SaTScanTM and one by the proposed algorithm) for all data sets, using (M =) 999 Monte Carlo steps. For each value of α in the range {0.001, 0.002, . . . , 1} the count of p-values ≤ α was recorded for BCSR and BTRENT, split into counts for H0 and HA. Dividing the H0 count by 3000 gives us the FPR (false positive rate, a.k.a. 1-specificity), and similarly for HA we have the TPR (true positive rate, a.k.a. power or sensitivity). These are shown in Figure 1. It can be seen that the FPR of the proposed algorithm is closer to parity with α across most α values. As expected, the TPR of the proposed algorithm is also higher than that of SaTScanTM when the FPR of the latter dips below parity (due to p-value ambiguity in those ranges of α).
Fig. 1. FPR/TPR curves for benchmark data sets: BCSR (left), BTRENT (right)
This is the amount of the relative increase in the probability of a case location occurring at the very centre of the anomaly, with the increase smoothly decreasing (following a Gaussian curve) as distance from the anomaly centre increases.
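The FPR/TPR curves of Figure 1 can be produced with a few lines of code; the sketch below (ours, not the authors' implementation) counts, for each α, the proportion of H0 and HA data sets whose p-value falls at or below α.

def fpr_tpr_curves(p_null, p_alt, alphas):
    # p_null and p_alt hold the p-values obtained for the H0 and HA data sets;
    # for each alpha, report the fraction of p-values at or below alpha.
    fpr = [sum(p <= a for p in p_null) / len(p_null) for a in alphas]
    tpr = [sum(p <= a for p in p_alt) / len(p_alt) for a in alphas]
    return fpr, tpr

alphas = [i / 1000 for i in range(1, 1001)]    # 0.001, 0.002, ..., 1 as in the text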
Although the proposed algorithm appears to rectify the overall drop in FPR and TPR, as mentioned in Section 4 we could have achieved this by simply randomising the p-value for each data set within its ambiguity range. Users need assurance that the p-value suggested by a modified test is consistent, or at least as consistent as that of SaTScanTM. Table 1 shows the mean retest variance for both tests, plus the randomised version just described. Here we selected 50 data sets at random from each of the 3000 H0 and HA data sets used for BCSR and BTRENT. We then retested each data set 50 times, calculating the p-value variance. We then took the mean of this variance across the 50 data sets in each batch, shown in Table 1. The results show clearly that the p-value consistency of the proposed algorithm is very close to that of SaTScanTM, and considerably better than the randomised version.

Table 1. Mean retest variance (×10−3) of the different algorithms

Data sets source   SaTScanTM   Proposed algorithm   Randomised algorithm
BCSR : H0            0.160          0.177                 1.486
BCSR : HA            0.042          0.043                 0.199
BTRENT : H0          0.140          0.159                 1.291
BTRENT : HA          0.065          0.080                 0.721
Although the rectification of the FPR (and corresponding increase in TPR) is our main aim, it is also useful to test if the proposed algorithm has any noticeable effect on overall detection performance (i.e. the ratio of TPR to FPR). Plotting the pairings of FPR and TPR for each α value gives us the standard ROC (Receiver Operator Characteristic) curve for both SaTScanTM and the proposed algorithm, as applied to BCSR and BTRENT. It has been proved [9] that the area under a ROC curve (hereafter AUC) is equal to the probability that the test, when presented with one H0 and one HA data set, will correctly distinguish which is which. However, AUC is calculated using 0 < FPR < 1, whereas in practice only small values of α are of interest, so we compare the two algorithms using the area in the range 0 < FPR < 0.1 (hereafter AUC0.1), assessing the difference with a 'within datasets' Monte Carlo method.
This involves randomly selecting 50% of data sets and swapping the p-values of the two tests, then recalculating the ratio of AUC0.1 for both tests. This swapping is permissible under a null hypothesis that the underlying performance of the two tests is identical. Repeating this (say 10,000 times) produces a distribution for the ratio of the two AUC0.1 values, against which the 'real' ratio can be measured.
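The 'within datasets' comparison can be sketched as follows (our own illustration; auc01 is an assumed helper returning the area under the ROC curve restricted to 0 < FPR < 0.1, and selecting each data set for swapping with probability 0.5 approximates the 50% selection described above).

import random

def swap_test(p_a, p_b, labels, auc01, reps=10000, seed=0):
    # p_a, p_b: p-values of the two algorithms on the same data sets;
    # labels[i] is True for an HA data set and False for an H0 data set;
    # auc01(p_values, labels) is an assumed helper returning AUC0.1.
    # Returns the fraction of permuted ratios at least as large as the observed one.
    rng = random.Random(seed)
    observed = auc01(p_a, labels) / auc01(p_b, labels)
    count = 0
    for _ in range(reps):
        a, b = list(p_a), list(p_b)
        for i in range(len(a)):
            if rng.random() < 0.5:       # swap the two tests' p-values on ~50% of data sets
                a[i], b[i] = b[i], a[i]
        if auc01(a, labels) / auc01(b, labels) >= observed:
            count += 1
    return count / reps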
Fig. 2. ROC curves (in range 0 < FPR < 0.1) for: BCSR (left), BTRENT (right)
figure of 1.44% and 0.7929 for the figure of -0.62%. This indicates the confidence with which we may reject a null hypothesis that the increase in AUC0.1 (or decrease in the case of the -0.62% figure) is due solely to random variations in test performance. The significance levels suggest we can be confident that some performance improvement occurred in the BCSR data sets, whereas no significant difference in performance occurred in the BTRENT data sets. The reason for this is likely to be the multiple anomalies present in the BCSR sets, an issue which is discussed further in Section 7.
7
Conclusion
In this paper we have identified a potential ambiguity in p-values produced by the Bernoulli version of the Spatial Scan Statistic (SSSB), when used within the Monte Carlo algorithm with which it is normally associated. We proposed and tested a modified Monte Carlo algorithm which uses a very inexpensive secondary test to produce a more precise p-value, with a retest consistency similar to the p-value produced by the SaTScanTM software. The proposed algorithm appears to restore the false positive rate (FPR) to approximate parity with the nominal significance level of the test, and correspondingly increases the true positive rate (TPR), a.k.a. power. Two batches of 6000 data sets were used for benchmark testing: one with three anomalies set against a homogeneous background point density; one with a single anomaly set against an inhomogeneous background point density. A similar rectification of FPR and TPR rates was seen across both, and in the former a small (but statistically significant) increase in overall detection performance was also observed, as measured by the area under the ROC curve in the critical range 0 < FPR < 0.1. No such improvement was observed in the batch with an alternative
hypothesis containing only a single anomaly; the batch that witnessed the improvement contains data sets with three anomalies. Due to its global nature, our secondary test statistic (i.e. the mean LLR) may be sensitive to the presence of multiple anomalies in a way that the Spatial Scan Statistic (i.e. the maximum LLR) is not. This raises the question, even when there is no p-value ambiguity in the Spatial Scan Statistic, whether it might be useful to take the mean LLR into account in some way. We hope these results are of interest to the research community, and may in future investigate the properties of other non-maximum LLR values with a view to gaining improvements in anomaly detection. It is expected any such improvements will apply equally to spatio-temporal point data, and this may be the subject of future benchmark testing.
Acknowledgments We wish to thank the Medical Research Council for funding Simon Read, the principal researcher, programmer and author. We also thank Professor Stephen Walters of ScHARR, for his thoughts on the statistical comparison of ROC curves.
References 1. Viel, J.F., Floret, N., Mauny, F.: Spatial and space-time scan statistics to detect low rate clusters of sex ratio. Environmental and Ecological Statistics 12(3), 289–299 (2005) 2. Riitters, K.H., Coulston, J.W.: Hot Spots of Perforated Forest in the Eastern United States. Environmental Management 35(4), 483–492 (2005) 3. Kulldorff, M.: A Spatial Scan Statistic. Communications in Statistics - Theory and Methods 26(6), 1481–1496 (1997) 4. Kulldorff, M.: SaTScanTM Users Guide (2009), http://www.satscan.org/techdoc.html 5. Neill, D.B., Moore, A.W., Cooper, G.F.: A Bayesian spatial scan statistic. In: Weiss, Y. (ed.) Advances in Neural Information Processing Systems. MIT Press, Cambridge (2006) 6. Read, S., Bath, P.A., Willett, P., Maheswaran, R.: A Spatial Accuracy Assessment of an Alternative Circular Scan Method for Kulldorff’s Spatial Scan Statistic. In: Fairbairn, D. (ed.) Proceedings of GISRUK 2009. Durham University, UK (2009) 7. Taseli, A., Benneyan, J.C.: Risk Adjusted Bernoulli Spatial Scan Statistics. In: Proceedings of the Industrial Engineering Research Conference (IERC) 2009. Institute of Industrial Engineers, Norcross (2009) 8. Tango, T., Takahashi, K.: A Flexibly Shaped Spatial Scan Statistic for Detecting Clusters. International Journal of Health Geographics 4(1) (2005) 9. Hanley, J.A., McNeil, B.J.: The meaning and use of the area under a receiver operating characteristic curve. Radiology 143(1), 29–36 (1982)
Vertical Fragmentation Design of Distributed Databases Considering the Nonlinear Nature of Roundtrip Response Time Rodolfo A. Pazos R.1, Graciela Vázquez A.2, José A. Martínez F.1, and Joaquín Pérez O.3 1
Instituto Tecnológico de Cd. Madero ESIME, Instituto Politécnico Nacional 3 Centro Nacional de Investigación y Desarrollo Tecnológico [email protected], [email protected], [email protected], [email protected] 2
Abstract. One of the challenges for applications of distributed database (DDB) systems is the possibility of expanding through the use of the Internet, which is so widespread nowadays. One of the most difficult problems in DDB systems deployment is distribution design. Additionally, existing models for optimizing the data distribution design have only aimed at optimizing query transmission and processing costs, overlooking the delays incurred by query transmission and processing times, which is a major concern for Internet-based systems. In this paper a mathematical programming model is presented, which describes the behavior of a DDB with vertical fragmentation and makes it possible to optimize its design taking into account the nonlinear nature of roundtrip response time (query transmission delay, query processing delay, and response transmission delay). This model was solved using two metaheuristics: the threshold accepting algorithm (a variant of simulated annealing) and tabu search, and comparative experiments were conducted with these algorithms in order to assess their effectiveness for solving this problem.

Keywords: Mathematical programming, distributed databases, vertical fragmentation, metaheuristic algorithms.
1 Introduction

Computing, from its beginning, has undergone many changes: from the massive mainframes that carried out limited tasks and whose use was restricted to a few organizations, up to modern computers (either PCs or laptops) that have far more processing power than those early machines and are more involved in people's daily tasks. Organizations have embraced the use of computer communication networks, including the Internet, making easier the implementation of Distributed Database (DDB) systems. DDB systems managers have the responsibility of carrying out the fragmentation, allocation and monitoring of data at the network sites. In most DDB systems the distribution design greatly affects the overall system performance, mainly because the time required
for query transmission and processing and the transmission of query responses depends considerably on the sites where data resides, the sites where queries are issued, data access frequencies, the query processing capacity of database (DB) servers, and the transmission speed of communication lines. Traditionally, the DDB distribution design problem has been defined as finding fragments of relations (tables) and their allocation such that the overall costs incurred by query transmission and processing are minimized. However, the importance of considering response time in DDB modeling has been recognized for many years [1]. Unfortunately, a survey of the specialized literature revealed that roundtrip response time has not been considered. Therefore, existing models for optimizing DDB design have only aimed at minimizing the transmission and processing costs of queries, overlooking the delays incurred by transmission and processing times (including queuing delays), which are one of the major concerns for Internet-based systems. We define roundtrip response time as the time elapsed between the arrival of a query at a network site (query source) and the time when the last bit of the response to the query is received at the query source. This definition implies that the roundtrip consists of: transmitting a query from its source site to a DB server site (where the query is processed), processing the query and generating a response at the server site, and transmitting the response back to the source site. Considering only transmission and processing costs yields a linear optimization expression with respect to the problem decision variables [2, 3, 4], which is easier to deal with than the expression that involves roundtrip response time (presented in Subsection 3.1). Some investigators have assumed that transmission and processing costs are directly proportional to roundtrip response time; however, this assumption is not always true. For example, when considering a transmission line, the transmission cost of a query has been defined as being directly proportional to the transmission time of
Fig. 1. Variation of processing+queueing time with respect to query load
the query; therefore, the overall transmission cost (incurred by the entire query load) is directly proportional to the amount of query data transmitted on the line. Unfortunately, this simplifying assumption ignores queuing time; i.e., the time a query needs to wait before being transmitted because the transmission line queue is not empty when the query arrives. Similarly, queuing processes should also be considered for query processing at server sites, as well as for transmission of query responses from servers back to source sites. For illustrating this situation, Fig. 1 presents simulation results that show the behavior of processing+queuing time of queries at a server site with respect to query load. In fact, this behavior has been studied by queuing theory; thus, the average processing+queuing delay of queries at a server can be approximated by the following expression [5]:

T = \frac{1}{C - \lambda}
where C represents the processing capacity of the server and λ is the arrival rate of queries to the server. As the figure and the preceding formula show, when the query load approaches the server processing capacity, the processing+queuing time spent by queries grows steeply and tends to infinity, because the number of queries waiting to be processed grows larger and larger. Finally, in this paper, the DDB distribution design problem is defined as finding fragments of relations and their allocation such that the average roundtrip response time incurred by transmitting and processing all the queries and transmitting the query responses is minimized. This problem is modeled using a zero-one mathematical programming model, which makes it possible to solve it using the large assortment of metaheuristic algorithms available.
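A short numerical illustration of the formula above (ours; the capacity value is taken from Table 2): as the arrival rate λ approaches the capacity C, the delay T = 1/(C − λ) explodes.

def avg_delay(C, lam):
    # Average processing+queuing delay T = 1/(C - lam); only meaningful for lam < C.
    if lam >= C:
        raise ValueError("arrival rate must be below the server capacity")
    return 1.0 / (C - lam)

# With a server capacity of C = 500 queries/sec, the delay grows steeply
# as the load approaches capacity:
for lam in (100, 400, 490, 499):
    print(lam, avg_delay(500, lam))      # 0.0025, 0.01, 0.1, 1.0 seconds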
2 Related Work

A survey of the specialized literature revealed that modeling roundtrip response time has not been considered, as shown in Table 1, which summarizes the most recent and relevant works on DDB fragmentation and allocation. The second, third and fourth columns of Table 1 show that some works have addressed only the fragmentation problem, most works have only dealt with the fragment allocation problem, and some others have integrally addressed both problems. The fifth column shows that all the previous works have considered transmission, access or processing costs. The seventh column shows that most works have proposed heuristic algorithms for addressing DDB fragmentation and allocation, and only a few have proposed mathematical programming formulations. This comparative survey reveals that our approach has a distinctive feature that sets it apart from others: the incorporation of roundtrip response time into a mathematical programming formulation of the DDB fragmentation+allocation problem.
Table 1. Related works on fragmentation and allocation for DDBs

[The table compares the following works: Chakravarthy, 94 [1]; Tamhankar, 98 [6]; Golfarelli, 99 [2]; Huang, 01 [7]; Pérez, 03 [3]; Ulus, 03 [8]; Menon, 05 [4]; Basseda, 06 [9]; Ma, 06 [10]; Hababeh, 07 [11]; Gorla, 08 [12] (* for centralized databases); Tambulea, 08 [13]; and our approach. The comparison criteria are: problems addressed (fragmentation, allocation, integrated fragmentation + allocation), value to minimize (transmission or processing costs, roundtrip response time), and solution approach (heuristic algorithm, mathematical programming formulation).]
3 Mathematical Model

The DDB distribution design problem is defined as follows: given a set of attributes (data-objects) O = {o1, o2, ..., oL}, a computer communication network that consists of a set of sites S = {s1, s2, ..., sI}, a set of queries Q = {q1, q2, ..., qK} generated at the network sites, the attributes required by each query, and the generation rates of queries at each site, the problem consists of finding fragments of relations and their allocation that minimizes the average roundtrip response time. This section describes a (binary) integer-programming model for the vertical fragmentation and allocation of data that involves roundtrip response time. We refer to this model as the VFA-RT model. In this model, the decision about storing an attribute l in site j is represented by a binary variable xlj: xlj = 1 if l is stored in j, and xlj = 0 otherwise. Thus, the problem consists of finding an assignment of values to the x's, such that the assignment satisfies certain restrictions and minimizes an objective function.

3.1 Objective Function

The objective function below models the average roundtrip response time using three terms (inside the parentheses): the delay of queries incurred by their transmission
(including queuing delay) on the communication lines from their sources to the servers where they are processed, the delay of queries incurred by their processing (including queuing delay) at the servers, and the delay of query responses incurred by their transmission (including queuing delay) on the communication lines from the servers back to the sources of their corresponding queries.

\min z = \frac{1}{\sum_{j}\sum_{k}\sum_{i} f_{ki}\, y_{jk}} \left( \sum_{ij} \frac{1}{\frac{\mu_Q C_{ij}}{\sum_{k} f_{ki}\, y_{jk}} - 1} + \sum_{j} \frac{1}{\frac{C_j}{\sum_{k}\sum_{i} f_{ki}\, y_{jk}} - 1} + \sum_{ij} \frac{1}{\frac{\mu_R C_{ij}}{\sum_{k} f_{ki}\, y_{jk}} - 1} \right)   (1)
where
z: average roundtrip response time,
yjk: a dependent decision variable, such that yjk = 1 if one or more attributes used by query k are stored in site j, and yjk = 0 otherwise,
fki: arrival rate of query k to source node i (expressed in queries/sec.),
Cj: processing capacity of server j (expressed in queries/sec.),
Cij: transmission speed of the communication line from node i to node j (expressed in bits/sec.),
1/μQ: mean length of queries (expressed in bits/query), and
1/μR: mean length of query responses (expressed in bits/response).

It is not easy to see that expression (1) models the average roundtrip response time. Unfortunately, its mathematical derivation and explanation is rather complex and cannot be included here due to space limitations; however, inquisitive readers can find the derivation and explanation in [14].

3.2 Restrictions

The model involves three groups of restrictions, shown below. Since replication of attributes is not considered, the first group of restrictions (2) specifies that each attribute must be allocated to only one site. Additionally, the second set of restrictions (3) specifies that each attribute should be allocated to a site where at least one query that involves the attribute is generated. Finally, the third group of restrictions (4) is used for restricting the values of the dependent variables y's according to the values of the x's. These restrictions are expressed as follows:
\sum_{j} x_{lj} = 1 \quad \forall\, l   (2)

(each attribute must be stored in only one site);

x_{li} \le \sum_{k} q_{kl}\, \varphi_{ki} \quad \forall\, l, i, \qquad \text{where } \varphi_{ki} = \begin{cases} 1, & \text{if } f_{ki} > 0 \\ 0, & \text{if } f_{ki} = 0 \end{cases}   (3)

(each attribute l must be allocated to a site i where at least one query that involves the attribute is generated);

t\, y_{jk} - \sum_{l} q_{kl}\, x_{lj} \ge 0 \quad \forall\, j, k   (4)

(this restriction forces the value of yjk to 1 when some product qkl xlj equals 1, i.e., some attribute l involved in query k is located in server j, and induces yjk to 0 otherwise), where t = number of attributes.
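To make the model concrete, the following Python sketch (ours; it assumes the reconstruction of expression (1) given above and treats saturated lines or servers as out of scope) derives the y variables from an attribute-to-site assignment x, as in restriction (4), and evaluates the average roundtrip response time.

def roundtrip_z(x, q, f, cap, cline, mu_q, mu_r):
    # x[l][j] = 1 if attribute l is stored at site j (the decision variables);
    # q[k][l] = 1 if query k uses attribute l; f[k][i] = arrival rate of query k
    # at site i; cap[j] = processing capacity of server j (queries/sec);
    # cline[i][j] = speed of the line i->j (bits/sec); 1/mu_q and 1/mu_r are the
    # mean query and response lengths (bits).
    L, J = len(x), len(x[0])
    K, I = len(q), len(f[0])
    # dependent variables y[j][k], as forced by restriction (4)
    y = [[int(any(q[k][l] and x[l][j] for l in range(L))) for k in range(K)]
         for j in range(J)]
    lam_line = [[sum(f[k][i] * y[j][k] for k in range(K)) for j in range(J)]
                for i in range(I)]                       # query traffic on line i->j
    lam_srv = [sum(lam_line[i][j] for i in range(I)) for j in range(J)]
    total = sum(sum(row) for row in lam_line)            # normalisation in expression (1)
    if total == 0:
        return 0.0
    delay = 0.0
    for i in range(I):
        for j in range(J):
            lam = lam_line[i][j]
            if lam > 0:
                delay += 1.0 / (mu_q * cline[i][j] / lam - 1)    # query transmission + queuing
                delay += 1.0 / (mu_r * cline[i][j] / lam - 1)    # response transmission + queuing
    for j in range(J):
        if lam_srv[j] > 0:
            delay += 1.0 / (cap[j] / lam_srv[j] - 1)             # processing + queuing at server j
    return delay / total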
4 Solution Methods

Given that the problem described in [15] is NP-hard (as shown in the reference) and considering that the VFA-RT model is similar to it (the major difference being the objective function), we informally conclude that VFA-RT must also be NP-hard. When dealing with NP-hard problems, the practical approach is to use metaheuristic algorithms for solving them. In this case, we decided to use a variant of simulated annealing called threshold accepting (TA), since we have successfully applied it to a problem similar to VFA-RT [16]. Additionally, we also decided to use tabu search (TS), since it has been applied to a wide variety of combinatorial optimization problems, in order to determine how it compares against TA for this problem. Since the implemented version of the TA algorithm is not as widely known as TS, a brief description is presented next.

4.1 Threshold Accepting Algorithm

This is a simplification of the original simulated annealing, which spends less computing time than the classic algorithm [17]. In this algorithm, given a current feasible solution x = {xlj | l = 1, ..., L; j = 1, ..., J}, a neighbor feasible solution y is randomly generated. If Δz = z(y) − z(x) < T, y is accepted as the new current solution (where z represents the objective function and T is an algorithm parameter, called temperature, that has a suitably chosen value); otherwise, the current solution remains unchanged. Notice that T replaces the expression exp(−ΔE/kT) of the classic algorithm. The temperature value T is reduced each time thermal equilibrium is reached (line 14). The value of T is decremented by multiplying it by a cooling factor μ < 1, and a new search cycle (lines 7-14) is carried out using the new value of T. These cycles are repeated inside an outer cycle (lines 6-16) with ever decreasing values of T until the system reaches freezing. This version of the algorithm has a third cycle (lines 8-13) for generating batches of solutions, which helps improve its performance.

THRESHOLD ACCEPTING PSEUDOCODE
1)  begin
2)    real T0, μ
3)    integer i, n
4)    x = initial feasible solution
5)    T = T0    (high value of initial temperature)
6)    repeat
7)      repeat
8)        for i = 1 to n
9)          y = randomly generated neighbor solution of x
10)         if z(y) − z(x) < T then
11)           x = y
12)         end if
13)       end for
14)     until thermal equilibrium is reached
15)     T = μT
16)   until freezing is reached
17) end
Like most metaheuristic algorithms, the performance of simulated annealing depends on the values assigned to the parameters (T0, μ, n) and the stop criteria (thermal equilibrium and freezing). Usually, the values for these parameters (and stop criteria) that yield the best performance are determined by trial and error; additionally, the best values vary from problem to problem. In these circumstances, for our problem we used the values and criteria proposed in [16].
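A compact Python rendering of the pseudocode above (ours; the thermal-equilibrium inner loop is collapsed into a fixed batch per temperature level, and the default parameter values and the neighbour move are illustrative assumptions, not the authors' settings).

import random

def threshold_accepting(x0, objective, neighbor, T0=1.0, mu=0.9, n=50,
                        levels=100, seed=0):
    # x0: initial feasible solution; objective(x) -> value to minimise;
    # neighbor(x, rng) -> a randomly generated feasible neighbour of x,
    # e.g. moving one attribute to a different site in the VFA-RT setting.
    rng = random.Random(seed)
    x, zx = x0, objective(x0)
    best, zbest = x, zx
    T = T0                                  # high value of initial temperature
    for _ in range(levels):                 # outer cycle: ever decreasing T until "freezing"
        for _ in range(n):                  # one batch of candidate solutions at this temperature
            y = neighbor(x, rng)
            zy = objective(y)
            if zy - zx < T:                 # accept if the deterioration is below the threshold T
                x, zx = y, zy
                if zx < zbest:
                    best, zbest = x, zx
        T *= mu                             # cooling factor mu < 1
    return best, zbest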
5 Experimental Results

Experiments were carried out for 12 randomly generated problems (P1 through P12). Table 2 shows the parameters used for generating the problems' data; i.e., the variable coefficients for expressions (1)-(4). It is worth pointing out that TA and TS are randomized algorithms; i.e., they move randomly from the current solution to the next one (line 9 of the TA pseudocode); therefore, they usually generate different solutions in different executions for the same problem. Thus, in order to obtain statistically significant results, 30 different instances were generated for each problem. The experiment consisted of solving each of the instances (for problems P1-P12) using TA and TS. Table 3 shows the results obtained from this experiment, which was carried out on a laptop with the following characteristics: Intel Core Duo CPU T5450 at 1.66 GHz, with 2038 MB RAM, and the 32-bit Vista Home Premium operating system. The first column indicates the problem. Table columns two through four hold the minimal, average and maximal values of the objective function for the solutions obtained by TA, and column five holds its average execution time in seconds. Columns six through nine show the corresponding values for TS. Since for problems P1 through P4 the optimal values are known, an asterisk indicates when an algorithm was able to find the optimal solution. The results show that TA always finds the optimum for problems P1-P3 and sometimes finds it for problem P4; however, TS always finds the optimum only for problem P3, sometimes finds it for problems P1 and P2, and never finds it for problem P4. By looking at the average values of the objective function, it is clear that TA obtains better results than TS, usually with a smaller amount of execution time.
Table 2. Input data for generating test cases

Problem | No. Attributes L (=t) | No. Sites I | No. Queries K | Attribute Usage qkm | Arrival Rate fki | Processing Capacity Cj | Transmission Speed Cij | Mean Query Length 1/μQ | Mean Response Length 1/μR
P1  |  2 |  3 |  2 | 0-1 | 6-21 |    50 |   100,000 | 1,000 | 5,600
P2  | 10 |  2 |  3 | 0-1 | 0-19 |    50 |   100,000 | 1,000 | 5,600
P3  |  3 |  4 |  5 | 0-1 | 1-25 |   500 | 1,000,000 | 1,000 | 5,600
P4  | 10 |  3 |  4 | 0-1 | 2-21 |   500 | 1,000,000 | 1,000 | 5,600
P5  | 23 |  6 |  5 | 0-1 | 0-24 |   500 | 1,000,000 | 1,000 | 5,600
P6  | 14 |  5 | 13 | 0-1 | 0-25 |   500 | 1,000,000 | 1,000 | 5,600
P7  | 12 |  9 | 10 | 0-1 | 1-25 | 5,000 | 1,000,000 | 1,000 | 5,600
P8  | 23 |  5 | 10 | 0-1 | 0-25 | 5,000 | 1,000,000 | 1,000 | 5,600
P9  | 18 | 11 |  9 | 0-1 | 0-25 | 5,000 | 1,000,000 | 1,000 | 5,600
P10 | 26 | 11 |  7 | 0-1 | 0-25 | 5,000 | 1,000,000 | 1,000 | 5,600
P11 | 17 | 13 | 10 | 0-1 | 0-25 | 5,000 | 1,000,000 | 1,000 | 5,600
P12 | 28 |  9 | 12 | 0-1 | 0-25 | 5,000 | 1,000,000 | 1,000 | 5,600
Table 3. Comparative results for threshold accepting and tabu search (objective function values; time in seconds; * = optimal solution found)

Prob. | TA Min.   | TA Avg.   | TA Max.   | TA Time | TS Min.   | TS Avg.   | TS Max.   | TS Time
P1    | 0.228208* | 0.228208* | 0.228208* |   3     | 0.228208* | 0.240492  | 0.289628  |     9
P2    | 0.118173* | 0.118173* | 0.118173* |  15     | 0.118173* | 0.260277  | 0.446106  |    12
P3    | 0.007264* | 0.007264* | 0.007264* |   1     | 0.007264* | 0.007264* | 0.007264* |    42
P4    | 0.006210* | 0.006294  | 0.006746  |   2     | 0.006681  | 0.006943  | 0.007398  |   148
P5    | 0.008271  | 0.008935  | 0.010108  |  24     | 0.009887  | 0.010799  | 0.011491  | 1,755
P6    | 0.009362  | 0.011862  | 0.013149  |  63     | 0.018224  | 0.028068  | 0.098507  |    21
P7    | 0.006539  | 0.006673  | 0.006962  |  39     | 0.007648  | 0.007895  | 0.008104  | 1,300
P8    | 0.006598  | 0.007814  | 0.010415  |  24     | 0.008836  | 0.011892  | 0.014403  | 1,255
P9    | 0.006715  | 0.007009  | 0.007309  |  80     | 0.007867  | 0.008422  | 0.008749  | 3,961
P10   | 0.006745  | 0.007123  | 0.008014  | 100     | 0.008163  | 0.008463  | 0.008744  | 7,615
P11   | 0.006895  | 0.007252  | 0.008322  | 111     | 0.008088  | 0.008653  | 0.009213  | 4,968
P12   | 0.007369  | 0.010280  | 0.012339  |  91     | 0.011947  | 0.014928  | 0.020861  |   908
6 Final Remarks

This paper describes a mathematical programming model for the DDB distribution design problem, which is defined as finding fragments of relations and their allocation such that the average roundtrip response time incurred by transmitting and processing all the queries and transmitting the query responses is minimized. The optimal design problem was formulated as a zero-one integer programming model (called VFA-RT), which consists of an objective function (1) that aims at minimizing the overall
roundtrip query response time, and three groups of constraints (2)-(4). Unlike previous works, the objective function used in our formulation permits modeling the nonlinear nature of query response time. Formulating the DDB distribution design problem as a mathematical programming model makes it possible to solve it using the large assortment of metaheuristic algorithms available. Therefore, we used two metaheuristic algorithms for solving the VFA-RT model: threshold accepting and tabu search. Comparative experiments were conducted with these algorithms in order to assess their effectiveness for solving this problem. The experimental results show that threshold accepting obtains better results than tabu search, usually with a smaller amount of execution time. However, it is important to point out that the implemented tabu search is a basic version that does not include sophisticated intensification and diversification strategies; therefore, perhaps a more sophisticated implementation could yield better results.
References 1. Chakravarthy, S., Muthuraj, J., Varadarajan, R., et al.: An Objective Function for Vertically Partitioning Relations in Distributed Databases and its Analysis. Distributed and Parallel Databases 2(2), 183–207 (1994) 2. Golfarelli, M., Maio, D., Rizzi, S.: Vertical Fragmentation of Views in Relational Data Warehouses. In: Proc. Settimo Convegno Nazionale Sistemi Evoluti per Basi di Dati (SEBD 1999), Villa Olmo, Italy, pp. 23–25 (1999) 3. Pérez, J., Pazos, R., Santaolaya, R., et al.: Data-Object Replication, Distribution, and Mobility in Network Environments. In: Broy, M., Zamulin, A.V. (eds.) PSI 2003. LNCS, vol. 2890, pp. 539–545. Springer, Heidelberg (2004) 4. Menon, S.: Allocating Fragments in Distributed Databases. IEEE Trans. Parallel and Distributed Systems 16(7), 577–585 (2005) 5. Kleinrock, L.: Communication Nets: Stochastic Message Flow and Delay. Dover Publications, USA (2007) 6. Tamhankar, A.M., Ram, S.: Database Fragmentation and Allocation: an Integrated Methodology and Case Study. IEEE Trans. Systems, Man and Cybernetics Part A 28(3), 288– 305 (1998) 7. Huang, Y.F., Chen, J.H.: Fragment Allocation in Distributed Database Design. J. of Information Science and Engineering 17(3), 491–506 (2001) 8. Ulus, T., Uysal, M.: Heuristic Approach to Dynamic Data Allocation in Distributed Database Systems. Pakistan J. of Information and Technology 2(3), 231–239 (2003) 9. Basseda, R.: Fragment Allocation in Distributed Database Systems. Technical Report. Faculty of Electrical and Computer Eng., School of Engineering, Database Research Group, University of Tehran, Iran (2006) 10. Ma, H., Schewe, K.-D., Kirchberg, M.: A Heuristic Approach to Vertical Fragmentation Incorporating Query Information. In: Proc. 7th Int. Baltic Conf. on Databases and Information Systems, pp. 69–76 (2006) 11. Hababeh, I., Ramachandran, M., Bowring, N.: Application Design for Data Fragmentation and Allocation in Distributed Database Systems. In: Distributed Database Systems, Research and Practice Conference, Leeds, UK (2007) 12. Gorla, N., Wing-yan, B.P.: Vertical Fragmentation in Databases Using Data-Mining Technique. Int. J. of Data Warehousing and Mining 4(3), 35–53 (2008)
13. Tambulea, L., Horvat-Petrescu, M.: Redistributing Fragments into a Distributed Database. Int. J. of Computers, Communications & Control 3(4), 384–394 (2008) 14. Pazos, R.A., Vázquez, G., Pérez, J., Martínez, J.A.: Modeling the Nonlinear Nature of Response Time in the Vertical Fragmentation Design of Distributed Databases. Advances in Soft Computing 50, 605–612 (2009) 15. Lin, X., Orlowska, M.E., Zhang, Y.: On Data Allocation with the Minimum Overall Communication Costs in Distributed Database Design. In: Proc. 5th International Conference on Computing and Information, pp. 539–544 (1993) 16. Pérez, J., Pazos, R., Velez, L., Rodríguez, G.: Automatic Generation of Control Parameters for the Threshold Accepting Algorithm. In: Coello Coello, C.A., de Albornoz, Á., Sucar, L.E., Battistutti, O.C. (eds.) MICAI 2002. LNCS (LNAI), vol. 2313, pp. 125–144. Springer, Heidelberg (2002) 17. Dueck, G., Scheuer, T.: Threshold Accepting: a General Purpose Optimization Algorithm Appearing Superior to Simulated Annealing. Journal of Computational Physics 90(1), 161–175 (1990)
Improving Iterated Local Search Solution for the Linear Ordering Problem with Cumulative Costs (LOPCC) David Terán Villanueva1, , Héctor Joaquín Fraire Huacuja2 , Abraham Duarte3 , Rodolfo Pazos R.2 , Juan Martín Carpio Valadez1 , and Héctor José Puga Soberanes1 1
Instituto Tecnológico de León (ITL), Avenida Tecnológico S/N Fracc. Industrial Julián de Obregón C.P. 37290, León, Gto. Mexico [email protected], [email protected], [email protected] 2 Instituto Tecnológico de Ciudad Madero (ITCM), Av. 1o. de Mayo esq. Sor Juana Inés de la Cruz s/n Col. Los Mangos C.P.89440, Cd. Madero Tamaulipas, Mexico [email protected], [email protected] 3 Departamento de Ciencias de la Computación, Universidad Rey Juan Carlos, Madrid, Spain [email protected]
Abstract. In this paper the linear ordering problem with cumulative costs is approached. The best known solution algorithm for the problem is the tabu search proposed by Duarte. In this work an experimental study was performed to evaluate the balance of intensification and diversification among the phases of that algorithm. The results show that the heuristic construction phase has a major impact on the tabu search algorithm's performance, an impact which tends to diminish with large instances. Then, to evaluate the diversification potential of the heuristic construction, two iterated local search algorithms were developed. Experimental evidence shows that distributing the heuristic construction across the search, as proposed here as a diversification mechanism, is better suited to solving large instances. The diversification potential of the heuristic construction method was confirmed, because with this approach we found 26 best known solutions not found by the tabu search algorithm.
1
Introduction
In wireless communication systems it is required that cell phones communicate constantly with some base station. In order for the base station to distinguish between different signals from the cell phones, the Universal Mobile Telecommunication Standard adopted the Code Division Multiple Access technique. In this technique each cell phone has a different code. Due to the induced distortion of
Authors thank the support received from the Consejo Nacional de Ciencia y Tecnología (CONACYT) and Consejo Tamulipeco de Ciencia y Tecnología (COTACYT).
radio waves propagation, the cell phones interfere with one another, so in order to preserve connectivity it is necessary to keep the interference below an acceptable level. To reach this goal each cell phone emits a signal, and the signals are detected by the base station in a sequential order previously established. When a signal is detected, it is eliminated, reducing the interference to the other cell phones. One then faces the problem of finding the order of detection of the cell phones that assures a proper reception for all the users. This problem is called Joint Power Control and Receiver Optimization (JOPCO) and was formulated by Benvenuto [1]. Based on this problem, Bertacco formulated the Linear Ordering Problem with Cumulative Costs (LOPCC) [2]. This problem is defined as follows: given a complete digraph G = (V, A) with nonnegative node weights di and nonnegative arc costs cij, the objective of LOPCC is to find a Hamiltonian path P = {p1, p2, ..., pn−1, pn} and the corresponding node values αi that minimize the expression:

\mathrm{LOPCC}(P) = \sum_{i=n}^{1} \alpha_{p_i}

where:

\alpha_{p_i} = d_{p_i} + \sum_{j=i+1}^{n} c_{p_i p_j}\, \alpha_{p_j} \quad \text{for } i = n, n-1, n-2, \ldots, 1
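For illustration (ours, not the authors' code), the cumulative cost of a given permutation can be evaluated with the backward recursion above in O(n^2); the example data at the end are made up.

def lopcc_cost(P, d, c):
    # P: a permutation of the node indices; d[i]: node weight; c[i][j]: arc cost.
    # The alpha values are computed backwards from the last node and summed.
    n = len(P)
    alpha = [0.0] * n
    for i in range(n - 1, -1, -1):          # i = n, n-1, ..., 1 (0-based here)
        alpha[i] = d[P[i]] + sum(c[P[i]][P[j]] * alpha[j] for j in range(i + 1, n))
    return sum(alpha), alpha

# Tiny made-up example with three nodes:
d = [1.0, 2.0, 0.5]
c = [[0.0, 0.1, 0.2], [0.3, 0.0, 0.4], [0.5, 0.6, 0.0]]
total, alpha = lopcc_cost([0, 1, 2], d, c)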
2 Related Work

2.1 Linear Ordering Problem (LOP)
The linear ordering problem (LOP) is closely related to LOPCC. In this section we review some recent works related to LOP. In [4] a tabu search metaheuristic algorithm for the linear ordering problem is proposed. In the algorithm, an insertion neighborhood strategy is used along with a local search that stops once the best improvement has been reached. Laguna implements intensification and diversification besides the strategies inherent to tabu search. Additional intensification is produced with a path-relinking strategy that is applied to a set of elite solutions, and with a long term memory used to insert certain nodes into positions complementary to the previously selected positions. In [5] an efficient method to perform insertion movements by using consecutive swap movements is proposed. This method is also used in two metaheuristic algorithms to solve LOP: an Iterated Local Search (ILS) and a Memetic Algorithm (MA). In both algorithms the local search stops somewhere between first and best improvement. It is reported that the Memetic Algorithm has better performance, in time and deviation from the optimum, than the Iterated Local Search (ILS). These results were validated using a Wilcoxon non-parametric test with α = 0.01. However, not enough evidence is presented that MA has better performance than ILS for instances of size 250. A relevant contribution of this work is the economical method used to recalculate the objective function value.
2.2 Linear Ordering Problem with Cumulative Costs (LOPCC)
Now we review the works directly related to the Linear Ordering Problem with Cumulative Costs (LOPCC). In [2] a formulation of the Linear Ordering Problem with Cumulative Costs is proposed for the first time. Also, two exact algorithms for solving LOPCC are described: one of them is based on mixed integer-linear programming, while the other is an enumerative ad-hoc algorithm that produces the best results. In [3] a tabu search algorithm for solving the LOPCC is proposed. The tabu search algorithm has four phases: construction, intensification, local search and diversification. In the construction phase a heuristic is used to generate a pre-specified number of solutions, from which the one with the best objective function value is selected as the initial solution. An iteration of the intensification phase begins by randomly selecting a node. The score value for node i in the context of the LOPCC is calculated using Eq. (1). In the intensification phase, the low-score nodes have priority when selecting positions that will decrease the objective function value of the current solution. The selected node is inserted in the first position that improves the objective value, if one exists, or alternatively in the best non-improving position. Note that this rule may result in the selection of a non-improving move. The move is executed even when it is not beneficial, resulting in a deterioration of the current objective function value. The inserted node becomes tabu-active for a pre-specified number of iterations, and it cannot be selected for insertion during this number of iterations. The intensification phase terminates after a pre-specified number of consecutive iterations without improvement.
\mathrm{score}(i) = \sum_{j} c_{ij} d_j + \sum_{j} c_{ji} d_i   (1)
The local search method is used to move the best solution found in the intensification phase to a local optimum. In this process each node of the current solution is inserted in the best improving position. The diversification phase is performed for a pre-specified number of iterations. A node is randomly selected iteratively, where the probability of selecting a node is inversely proportional to its frequency count (which records the number of times that it has been chosen for moving in the intensification phase). The chosen node is placed in the best position, as determined by the move values, allowing non-improving moves to be executed. Once a solution has been chosen as an initial solution, the global iterations of tabu search are performed. In a global iteration the intensification and diversification phases are performed. Each phase begins from the current solution and after termination returns the overall best solution and a new current solution. The search terminates after a pre-specified number of global iterations have occurred without improving the overall best solution.
and without scrambling. Also there were 49 LOLIB instances and 75 Random instances with a uniform distribution between 0 and 100, of different sizes: 25 instances of size 35, 25 of size 100, and 25 of size 150. A comparative study was carried out against the enumerative method proposed by Bertacco [2], a Random Greedy method proposed by Benvenuto [1] and an adaptation of the Tabu search proposed by Laguna [4]. In this work we propose to evaluate the balance of intensification and diversification across the four phases of the tabu search algorithm, in order to define new general-purpose intensification and diversification strategies using the best methods.
3
Experimental Study
The goal of this experimental study is to identify the best methods used in the tabu search algorithm, in order to define new general-purpose strategies of intensification and diversification. First, the cost/benefit relation for the construction, intensification, local search and diversification methods of the tabu search algorithm was determined. Then the identified best methods were evaluated as intensification or diversification strategies in the iterated local search framework. In the experimental study the algorithms were implemented in C and the experiments were performed on a computer with a Xeon 3.06 GHz processor and 4 GB of RAM. The instances used in the experiments were the same as those used in [3]. As the tabu search algorithm uses a predefined seed for random number generation and is time limited to 60 seconds, these parameters were also used in the experiments with the iterated local search algorithms.

3.1 Tabu Search Algorithm Diagnosis
To determine the cost/benefit relation for the construction, intensification, local search and diversification methods of the tabu search algorithm, the improvement contribution and the time spent by each method were measured. Table 1 shows the results of this experiment. The first column indicates the instance set used, the second one contains the indicator used, and the next four columns contain the results for the construction, intensification, local search and diversification methods. The indicators used were the improvement percentage, the percentage of the overall time, and the time in seconds used by each method. As we can see, for the first five instance sets the heuristic construction produces over 90% of the attained improvement with less than 20% of the time spent. For the next two sets, the construction produces over 60% of the improvement. The results for the last set show a different pattern due to the 60-second time restriction. In general the results show that the heuristic construction is the most important method in the tabu search algorithm, but its impact tends to diminish with large instances. As we can see, the main purpose of this method in the algorithm is to contribute to the intensification process by generating a high quality initial solution. Now we consider the next question: can we use the heuristic construction to define an efficient diversification method based on restarting the current search at different high quality solutions?
Table 1. Tabu Search Algorithm Evaluation

Instances   Analysis     Heuristic Const.   Intensification   Local Search   Diversification
UMTS 00     Improve %         94.9               2.44              2.66             0
            Time %            10.53             42.11             10.53            36.84
            Time sec.          0.02              0.08              0.02             0.07
UMTS 01     Improve %         90.18              3.21              6.62             0
            Time %            18.18             18.18              0               63.64
            Time sec.          0.02              0.02              0                0.07
UMTS 10     Improve %         95.27              1.59              3.14             0
            Time %            50                50                 0                0
            Time sec.          0.02              0.02              0                0
UMTS 11     Improve %         96.52              0.3               3.17             0
            Time %            33.33              0                33.33            33.33
            Time sec.          0.02              0                 0.02             0.02
LOLIB       Improve %         93.93              1.64              4.44             0
            Time %             9.68             20.07             48.28            21.97
            Time sec.          0.65              1.42              3.61             1.57
Rnd 35      Improve %         63.69              4.6              31.7              0
            Time %             6.2              20.54             50               23.26
            Time sec.          0.16              0.53              1.29             0.6
Rnd 100     Improve %         65.57             12.74             21.69             0
            Time %            11.07             16.02             55.36            17.55
            Time sec.          7.7              11.15             38.52            12.21
Rnd 150     Improve %         63.09             29.74              7.17             0
            Time %            44.46              9.88             37.07             8.59
            Time sec.         37.31              8.29             31.11             7.21
3.2 Diversification Based on the Initial Solution Construction Heuristic
To evaluate the diversification potential of the heuristic construction, two iterated local search algorithms were developed. In this framework a general-purpose diversification strategy based on initial solution construction methods is used. The initial solution construction is performed by the greedy algorithm GreedyHeuristicConstruction(10). It is used to generate 10 solutions, which are built in inverse order. For each one, the last node of the permutation is randomly selected. The next nodes are selected using a roulette that gives priority to those nodes that produce the lowest cost if they are placed at the next position of the permutation. This requires determining the cost of the objective function for each node that could be in the next position of the current node. This is an expensive process, but it is oriented towards producing very good quality solutions. Once a node is inserted, the algorithm also verifies whether that node could produce a better solution if it is inserted in a previous position.
The solution perturbation phase is performed by the algorithm Perturbation(p). Here a random node in the current permutation p is selected and inserted in a new position that improves the current solution; otherwise, the best non-improving position is selected. The local search method is performed by the algorithm LocalSearch(p). This method inserts each node of the current permutation p in the position that produces a better objective function value. Given a node, the position where it must be inserted is determined by checking the neighborhood from node position i to the left and continuing from position i to the right, using consecutive swap movements.

p = GreedyHeuristicConstruction(10)
p = LocalSearch(p)
Pbetter = p
While globalIterationsWithoutImprove < maxIterations and time < timeLimit {
    ILSB: p = Perturbation(p)
    ILSP: p = GreedyHeuristicConstruction(10)
    p = LocalSearch(p)
    if (p < Pbetter) {
        Pbetter = p
    } else {
        if (p > Pbetter * β) {
            p = Pbetter
        }
    }
}
Where: β is a numerical value between 1 and 2.
Fig. 1. Structure of the ILSB(ILSP) algorithm
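A schematic Python version of Figure 1 (our sketch; the construction, perturbation and local search routines are assumed to exist as described in the text, the handling of the no-improvement counter is our assumption, and beta and the limits are illustrative).

import time

def iterated_local_search(construct, perturb, local_search, cost, use_ilsp=False,
                          beta=1.05, max_no_improve=50, time_limit=60.0):
    # construct(r) -> best of r greedily built solutions; perturb(p) and
    # local_search(p) as described in the text; cost(p) -> objective value.
    # use_ilsp=False gives ILSB (perturbation step); True gives ILSP, which
    # re-runs the greedy construction as the perturbation.
    start = time.time()
    p = local_search(construct(10))
    best = p
    no_improve = 0
    while no_improve < max_no_improve and time.time() - start < time_limit:
        p = construct(10) if use_ilsp else perturb(p)
        p = local_search(p)
        if cost(p) < cost(best):
            best, no_improve = p, 0
        else:
            no_improve += 1
            if cost(p) > cost(best) * beta:   # too far from the best solution: restart from it
                p = best
    return best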
The first of these two iterated local search algorithms is a basic iterated local search (ILSB), shown in Figure 1. In this algorithm the heuristic construction is used to generate a high quality initial solution, which is locally optimized, and then the perturbation process is started. The solution perturbation has two goals: to diversify the search, because a node is randomly selected, and to intensify the search, because the selected node is inserted in a "good" position. In the second algorithm the heuristic construction is also used as the perturbation process. This algorithm is named ILSP and it is also shown in Figure 1. As we can see, the heuristic
construction work is distributed across the whole search process. The perturbation thus keeps diversifying the search, but always restarting from high-quality solutions. This is a diversification strategy that can be applied in several metaheuristics. In both algorithms the local search method is used to move the current solution to a local optimum, and it is applied after the initial solution construction and after the solution perturbation. The stopping criterion has two elements: stagnation detection and a 60-second time limit. Table 2 shows the results obtained with the UMTS instances, for which the optimal values are known. These results show that the ILSP algorithm slightly outperforms the TS_Duarte algorithm in both quality and efficiency.

Table 2. 100 UMTS instances of size 16 (25 of each group)
Instances                       Algorithm   Object. Func.  Avg. Err.  Std. Dev.  Num. Optima  CPU sec.
Synchronous without scramble    TS_Duarte   6.2384         0          0          25           0.06
                                ILSB        6.2384         0          0          25           0.01
                                ILSP        6.2384         0          0          25           0.02
Synchronous with scramble       TS_Duarte   14             0.03       0.2        24           0.03
                                ILSB        14.04072       0.34       1.02       22           0.01
                                ILSP        14.00512       0.07       0.34       24           0.02
Asynchronous without scramble   TS_Duarte   12.43          0.03       0.1        24           0.03
                                ILSB        12.43308       0.39       1.87       21           0.01
                                ILSP        12.42752       0.35       1.73       24           0.02
Asynchronous with scramble      TS_Duarte   19.05          0.06       0.3        24           0.03
                                ILSB        19.3113        0.82       2.14       19           0.01
                                ILSP        19.03144       0          0          25           0.02
Table 3 shows the experimental results with the LOLIB instances, for which no optimal solutions are known. As we can see, the ILSP algorithm clearly outperforms the TS_Duarte algorithm in both solution quality and efficiency, and for 16 instances the ILSP algorithm improves the best known solutions. Table 4 shows the results obtained with the Random instances of 35, 100 and 150 nodes. For these instances the optimal solutions are unknown, and we found 24 new best known solutions, 16 with ILSB and 10 with ILSP (2 found by both). As we can see, the TS_Duarte algorithm clearly outperforms the ILS algorithms. However, as previously shown, the performance of the TS_Duarte algorithm depends heavily on the heuristic construction, and this process is very expensive. On the instances of size 100 and 150, the number of best known solutions obtained by the TS_Duarte algorithm decreases as the instance size increases, whereas the number obtained by the ILS algorithms increases. We can also observe that, on the instances of size 150, the ILSP algorithm clearly outperforms the TS_Duarte algorithm in efficiency. The experimental evidence suggests that the distribution of the heuristic construction used in the ILSP algorithm is more appropriate for solving large instances than the approach used in the TS_Duarte algorithm.
Table 3. 44 LOLIB instances of sizes 44 (28), 50 (2), 56 (11) and 60 (3)

Algorithm    Object. Func.   Avg. Err.   Std. Dev.   #Best Known   CPU sec.
TS_Duarte    19650.8806      56.37       262.03      31            2.31
ILSB         19949.4102      17.95       59.76       29            1.18
ILSP         19632.3082      12.76       60.23       34            1.82
Table 4. 75 Random instances (25 of each group)

Size   Algorithm    Object. Func.    Avg. Err.   Std. Dev.   #Best Known   CPU sec.
35     TS_Duarte    0.3413           0.3         0.93        20            0.39
       ILSB         0.3444           1.7         5.83        15            0.3164
       ILSP         0.3426           0.61        1.77        17            0.4752
100    TS_Duarte    1197.2316        3.26        7.41        17            30.75
       ILSB         1321.8274        10.15       9.21        6             32.7828
       ILSP         1232.2675        8.14        6.18        2             42.2124
150    TS_Duarte    2250558.454      4.03        7.26        14            180.43
       ILSB         3020078.359      15.02       16.21       7             73.2504
       ILSP         2814863.541      21.72       17.86       4             82.1712
Table 5. The best known solutions for the LOLIB instances

Instance    Value         Instance     Value        Instance    Value
be75eec     5.09          t65n11xx     2.09         t75i11xx    4455.05
be75np      16543623.61   t65w11xx     19.26        t75k11xx    1.32
be75oi      2.79          t69r11xx     14.04        t75n11xx    9.9
be75tot     297138.27     t70b11xx     93.67        t75u11xxa   0.068427
stabu70     14.86         t70d11xx     79.74        tiw56n54    2.64
stabu74     14.03         t70d11xxb    4.44         tiw56n58    3.63
stabu75     9.42          t70f11xx     1.27         tiw56n62    3.02
t59b11xx    76262.11      t70i11xx     121146.03    tiw56n66    2.69
t59d11xx    4086.33       t70k11xx     0.49         tiw56n67    1.88
t59f11xx    24.14         t70l11xx     798.92       tiw56n72    1.57
t59i11xx    621676.35     t70n11xx     0.05         tiw56r54    2.63
t59n11xx    1618.9        t70u11xx     2280495776   tiw56r58    3.6
t65b11xx    28230.62      t70w11xx     0.04         tiw56r66    2.19
t65d11xx    3898.61       t70x11xx     0.23         tiw56r67    1.54
t65f11xx    1.25          t74d11xx     4.76         tiw56r72    1.35
t65i11xx    826120.21     t75d11xx     5.06
t65l11xx    2657.72       t75e11xx     2062.25
4 Conclusions
In this ongoing investigation, the linear ordering problem with cumulative costs is approached. The best known solution algorithm for this problem is the tabu search proposed in [3]. That algorithm has four phases: construction, intensification, local search and diversification. In this work an experimental study was conducted to evaluate the intensification and diversification balance between these phases. The results show that the heuristic construction has a major impact on the performance of the tabu search algorithm, which tends to diminish on large instances. To evaluate the diversification potential of the heuristic construction, two iterated local search algorithms were then developed. In this framework a general-purpose diversification strategy based on initial solution construction methods is presented. The experimental evidence shows that the distribution of the heuristic construction used as a diversification mechanism in the ILSP algorithm is better suited to solving large instances than the approach used in the tabu search algorithm. The diversification potential of the heuristic construction method was confirmed, since the ILSP algorithm found 26 best known solutions that were not found by the tabu search algorithm.

Table 6. The best known solutions for the Random instances
Instance    Value    Instance     Value       Instance     Value
t1d35.1     0.931    t1d100.1     263.066     t1d150.1     9652.803
t1d35.2     0.167    t1d100.2     297.27      t1d150.2     216751.9
t1d35.3     0.154    t1d100.3     1431.21     t1d150.3     772872.52
t1d35.4     0.196    t1d100.4     8249.749    t1d150.4     88416.64
t1d35.5     1.394    t1d100.5     172.11      t1d150.5     87132.345
t1d35.6     0.2      t1d100.6     479.3       t1d150.6     58579.73
t1d35.7     0.12     t1d100.7     6638.95     t1d150.7     172627.38
t1d35.8     0.226    t1d100.8     3095.19     t1d150.8     307534.865
t1d35.9     0.436    t1d100.9     67.249      t1d150.9     403646.64
t1d35.10    0.205    t1d100.10    174.44      t1d150.10    146736.66
t1d35.11    0.369    t1d100.11    256.74      t1d150.11    13900.04
t1d35.12    0.235    t1d100.12    246.92      t1d150.12    82560.127
t1d35.13    0.196    t1d100.13    611.495     t1d150.13    102720.712
t1d35.14    0.138    t1d100.14    279.38      t1d150.14    95393.684
t1d35.15    1.376    t1d100.15    458.02      t1d150.15    352880.29
t1d35.16    0.287    t1d100.16    748.476     t1d150.16    23720842.62
t1d35.17    0.2      t1d100.17    783.01      t1d150.17    98564.17
t1d35.18    0.384    t1d100.18    704.937     t1d150.18    767813.83
t1d35.19    0.236    t1d100.19    249.29      t1d150.19    87027.909
t1d35.20    0.068    t1d100.20    272.82      t1d150.20    2279010.02
t1d35.21    0.202    t1d100.21    232.05      t1d150.21    48116.464
t1d35.22    0.177    t1d100.22    176.64      t1d150.22    866394.89
t1d35.23    0.346    t1d100.23    1724.49     t1d150.23    23972543.85
t1d35.24    0.132    t1d100.24    559.05      t1d150.24    117970.027
t1d35.25    0.143    t1d100.25    698.71      t1d150.25    462316.51
Tables 5 and 6 present an update of the best known solutions reported in [3]; the new best known solutions found in this work are shown in boldface. We are currently developing new strategies to improve the performance of the metaheuristics used to solve the large instances reported in the literature.
References
1. Benvenuto, N., Carnevale, G., Tomasin, S.: Optimum power control and ordering in SIC receivers for uplink CDMA systems. In: IEEE-ICC 2005 (2005)
2. Bertacco, L., Brunetta, L., Fischetti, M.: The linear ordering problem with cumulative costs. Eur. J. Oper. Res. 189(3), 1345–1357 (2008)
3. Duarte, A., Laguna, M., Martí, R.: Tabu search for the linear ordering problem with cumulative costs. Springer Science+Business Media, Heidelberg (2009)
4. Laguna, M., Martí, R., Campos, V.: Intensification and diversification with elite tabu search solutions for the linear ordering problem (1998)
5. Schiavinotto, T., Stützle, T.: The linear ordering problem: Instances, search space analysis and algorithms. Journal of Mathematical Modelling and Algorithms 3(4), 367–402 (2005)
A Common-Sense Planning Strategy for Ambient Intelligence*

María J. Santofimia 1, Scott E. Fahlman 2, Francisco Moya 1, and Juan C. López 1

1 Computer Architecture and Networks Group, School of Computer Science, University of Castilla-La Mancha
2 Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
{MariaJose.Santofimia,Francisco.Moya,JuanCarlos.Lopez}@uclm.es, [email protected]
Abstract. Systems for Ambient Intelligence contexts are expected to exhibit an autonomous and intelligent behavior, by understanding and reacting to the activities that take place in such contexts. These activities, especially those labeled as trivial or simple tasks, are carried out effortlessly by most people. In contrast to what might be expected, computers have struggled to deal with these activities, while easily performing others, such as complex calculations, that are hard for humans. Imagine a situation where, while holding an object, the holder walks into a contiguous room. We effortlessly infer that the object changes its location along with its holder. However, such inferences are not well addressed by computers, due to their lack of common-sense knowledge and reasoning capability. Providing systems with these capabilities implies collecting a great deal of knowledge about everyday life and implementing inference mechanisms to derive new information from it. The work proposed here advocates a common-sense approach as a solution to the shortcomings of current systems for Ambient Intelligence.

Keywords: Ambient Intelligence, Common Sense, Planning.
1 Introduction

One decade after Mark Weiser defined the concept of Ubiquitous Computing, the IST Advisory Group brought up the Ambient Intelligence paradigm [1]. Building upon the Ubiquitous Computing paradigm, Ambient Intelligence refers to environments where people are surrounded by all kinds of intelligent and intuitive devices capable of recognizing and responding to situations. In these contexts, people perceive the environment as a service provider that satisfies their needs or inquiries in a seamless, unobtrusive, and invisible way. Systems for Ambient Intelligence are expected to observe their environment and understand the activities that take place in it, in a context-aware manner.

* This work has been funded by the Spanish Ministry of Science and Innovation under the project DAMA (TEC2008-06553/TEC), and by the Regional Government of Castilla-La Mancha under project RGRID (PAI08-0234-8083).
Moreover, the behavior exhibited by these systems is supposed to be driven by a set of specific goals that are to be achieved, maintained, or satisfied. However, until now these systems have remained far from the scenarios envisioned in [1]. In the authors' opinion, the major shortcoming of current systems is that they lack the facility for common-sense reasoning, and the large body of everyday knowledge that such reasoning requires.

Under the common-sense viewpoint, the problem of developing systems for Ambient Intelligence has to be tackled from two different angles: the cognitive and the behavioral. From the cognitive perspective, the problem can be addressed as an understanding problem. Comprehending a situation that takes place in a context might involve, for instance, the inference of implicit, nondeterministic or delayed effects. A delayed effect of having a kitchen sink with a stopper, after opening the tap, will be a water overflow. From a behavioral perspective, the problem can be addressed as a planning problem of deciding what action to take under given circumstances. Therefore, a common-sense strategy for planning and understanding, as presented in [2], seems to be the most compelling approach to emulate human-like rationality and reasoning capability.

The need for a planning strategy for Ambient Intelligence has already been stated in [3]. That work pays special attention to the device heterogeneity existing in Ambient Intelligence contexts, advocating a distributed-centralized approach based on the Hierarchical Task Network (HTN) formalism [4]. In spite of our emphasis on device dynamism and heterogeneity, we believe that this has to be addressed in the middleware layer, in a way that is transparent to the planner. This work uses the framework proposed in [5] to this end. Ultimately, we agree with [3] on the role assigned to the multi-agent system architecture as the context observer and regulator of behaviors. The multi-agent system assumes the responsibility for providing the planner with the required information about the context and with the mechanisms to respond.

From the cognitive perspective, understanding a situation involves the identification of implicit relations among the events taking place. To this end, we must effectively represent the knowledge from which these connections are to be inferred. This implicit knowledge is normally referred to as common-sense knowledge. Mueller [6] presents an extensive analysis of the key issues that should ground common-sense systems, among them the representation of events, time, and the effects of events. This work advocates a common-sense knowledge base and reasoning engine, embedded in a Lisp-like language, to address these issues.

By supporting an action planner on a common-sense system, this work proposes an approach to overcome some of the deficiencies of current approaches to systems for Ambient Intelligence. This work essentially joins two long-studied fields, common-sense reasoning and planning, and makes them work together towards Ambient Intelligence.

The remainder of this paper is organized as follows: First, a state-of-the-art survey of Ambient Intelligence is presented. This section also states the requirements for autonomous action in systems for Ambient Intelligence. The second section goes through the theoretical concepts grounding the planning strategy presented here.
Section three describes the foundations of the planning strategy for Ambient Intelligence and presents the common-sense approach that this work advocates. The fourth section describes the implementation details of the proposed solution. The last section summarizes the most important ideas presented in this work.
2 A Semantic Model for Actions and Events in Ambient Intelligence

The strong coupling between the theory of actions and events and planning has given birth to the concept of action planning. Commonly, actions and events have been treated as equivalent, or with the slight difference of considering actions as intentionally generated events [7]. However, this work holds not only that events are not actions, as defended in [8], but also that adopting this assumption prevents the planning strategy from being accomplished by means of low-level guidance, as demanded by Ambient Intelligence contexts. The main argument supporting this dissociation lies in considering actions and their agents as inseparable or correlative [9]. The theory of action for multi-agent planning [10] also advocates this distinction, although hinting that actions are accomplished by agents in their endeavor to achieve a goal. Despite agreeing in considering agents along with actions, this work does not characterize actions in terms of the goals they target but, as justified later on, in terms of the requirements for the action to take place and the consequences of the action.

The planning problem has traditionally been stated in terms of a world description or initial state, the goal to be achieved and the available actions, where actions are specified in terms of prerequisites and effects. Nevertheless, the planning strategy proposed here differs from this traditional approach in considering actions, rather than states of the world, as the primary elements. Planning problems under this perspective are stated in terms of a goal action, which is a non-feasible action intended to be performed on an object. The planner seeks the set of feasible actions that produce the same effects on the object as the goal action. The following section covers the details of the planning strategy; this section introduces the semantic model at the root of this proposal.

What does a semantic model stand for? From our perspective, it is an agreement on how to interpret the knowledge represented in the knowledge base. Furthermore, resorting to a semantic model shared among different instances is essential when these instances are expected to extract the same meanings or conclusions from the represented knowledge. The proposed semantic model, depicted in figure 1, is centered on the concept of a service. Under the Ambient Intelligence perspective, services can be decomposed into sets of actions performed on objects, and these services are offered by devices. Adopting this semantic model has a significant impact on the planning strategy, since we must consider not only the preconditions and effects of actions, but also the objects receiving those actions and the devices providing the services that give rise to actions. Applying this semantic model to the elements involved in the Ambient Intelligence architecture (the middleware and the multi-agent system) requires the semantic model to be formalized as an ontology. OWL [11] and RDF [12] are two standards commonly used to describe ontologies. Due to the nature of the relationships established among the classes of the semantic model, the OWL 2 language is used for description purposes. Figure 1 depicts the entities and relationships composing our proposed semantic model for Ambient Intelligence.
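Although the paper formalizes this model as an OWL 2 ontology, its core can be read as a handful of related concepts. The following Python dataclasses are only an illustrative rendering of that reading; the class and attribute names are our own and are not taken from the authors' ontology.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Action:
    """An action is performed on an object and is characterized by its
    before context (requirements) and after context (effects)."""
    name: str
    performed_on: str                       # kind of object the action acts upon
    before: List[str] = field(default_factory=list)
    after: List[str] = field(default_factory=list)

@dataclass
class Service:
    """A service decomposes into the actions it performs on objects."""
    name: str
    actions: List[Action] = field(default_factory=list)

@dataclass
class Device:
    """Devices offer services; the available devices therefore determine
    which actions are feasible at a given moment."""
    name: str
    offers: List[Service] = field(default_factory=list)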
Fig. 1. A Semantic Model for Ambient Intelligence
3 The Planning Strategy

Exhibiting a self-driven behavior is the most challenging requirement for an Ambient Intelligence system. In this endeavor, devising architectures with the capability to compose services out of general specifications or requirements becomes an essential ingredient for self-sufficiency. Obviously, this is not a trivial matter, and so far it has only been achieved in a semi-automatic manner. Service composition has been widely studied and implemented across a broad spectrum of computer technologies; the Web Services [13] approach is probably the most successful one. Here, the problem of supporting automatic service composition is reformulated as an action planning problem enhanced with common-sense knowledge.

3.1 Leveraging Common Sense

It is apparent that autonomous and self-sufficient behavior needs to be founded on knowledge about how things work, and on the capability to make decisions based on that knowledge. In this regard, available systems for common-sense knowledge are limited to a small group, mainly represented by Cyc [14] and Scone [15]. Cyc implements a Davidsonian [16] interpretation of events and actions: in the Cyc knowledge base, events are asserted as individuals about which facts can be stated, in order to specify, for instance, the moment in time when the event took place, the location, or the performing agent. The approach adopted by Scone better meets the demands of service composition. Scone models the knowledge about events and actions in terms of the context state, or world-model, just before the action or event takes place, and the context state immediately after it [17]. The planner exploits the fact that two actions with similar after contexts, and therefore similar effects on the context, can be considered functionally equivalent. Moreover, the before context can be read as the set of requirements that need to be met for the action to be performed. These two considerations are the cornerstone of the proposed planning algorithm. The following code provides some examples of action and event descriptions in Scone's Lisp-like syntax. Events and actions may have associated roles. For instance, the detectingFace action involves a detect event, and for that reason the action inherits from the event the detectionSource and the detectionObject roles. Roles are used to express the relevant characteristics of an event or an action
being described. The detectionSource role symbolizes the source over which the detect event is performed. Generally, actions tend to specialize the event roles: as can be observed for the detectingFace action, the detectionSource role is narrowed from being a general thing to being digital data. As mentioned above, the before context states the requirements that need to be satisfied for the action to take place. The requirement for detecting a face is that there exists an image file in which to look for a face silhouette; therefore, the before context of detectingFace demands the digital data to be an imageFile and not any other digital data object. The after context of the same action describes the effects of performing it: after performing the action, a face silhouette has been observed in the imageFile, implicitly stating the existence of a person, although not yet identified.

(new-event-type {detect} '({event})
  :roles ((:indv {detectionSource} {thing})
          (:indv {detectionObject} {thing}))
  ;; before the event, the object has not been noticed in the source
  :before ((new-not-statement {detectionObject} {is noticed in} {detectionSource}))
  ;; after the event, the object has been noticed in the source
  :after ((new-statement {detectionObject} {is noticed in} {detectionSource})))

(new-event-type {detectingFace} '({detect} {action})
  :throughout ((the-x-of-y-is-z {detectionSource} {detectingFace} {data})
               (the-x-of-y-is-z {detectionObject} {detectingFace} {face})
               (the-x-of-y-is-z {object-of} {detectingFace} {imageFile})
               (the-x-of-y-is-z {agent} {detectingFace} {faceDetectorSystem}))
  :before ((new-statement {data} {is recorded in} {imageFile})) ;; an image is required
  :after ((new-statement {detectionObject} {not yet identified as} {person identity})
          (new-statement {face} {is noticed in} {imageFile})))
This way of representing knowledge about how things work as a network of nodes and links among nodes, together with Scone's particular implementation of the inference and search process, makes automatic service composition a feasible task. The marker-passing algorithms implemented by the Scone engine provide a fast and efficient way of performing common-sense reasoning on the basis of the available knowledge.
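To make the two cornerstone ideas above concrete, the following tiny Python sketch treats the before and after contexts as plain sets of statements; this flat-set reading is our own simplification of Scone's context mechanism.

def functionally_equivalent(after_a, after_b):
    """Two actions whose after contexts assert the same statements are treated
    as interchangeable for service composition (a deliberate simplification)."""
    return set(after_a) == set(after_b)

def requirements_met(before, current_context):
    """An action is applicable when every statement of its before context
    already holds in the current world-model."""
    return set(before) <= set(current_context)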
As stated in [18], although with no proof of logical completeness, Scone is capable of performing "inheritance of properties, roles and relations in a multiple-inheritance type hierarchy; default reasoning with exceptions; detecting type violations; search based on set intersection; and maintaining multiple, overlapping world-views at once in the same KB." The planner algorithm exploits these capabilities and approaches the service composition task as a planning problem: finding the course of actions that, from a given before context (the current situation), leads to a concrete after context.

3.2 The Planning Algorithm

Taking advantage of service versatility, systems for Ambient Intelligence could respond to whatever is needed on the basis of the available services and devices. In this context, needs are considered to be the desire to perform actions on objects. The innovative contribution of this work is the description of services as actions to be performed on objects. To this end, the idea of Hierarchical Task Networks (HTN) is adapted to work with actions instead of tasks. The actions that can be performed by the system are determined by the devices and services available at each moment in time; those actions that cannot be performed, due to the lack of services providing the functionality, are here called non-feasible actions. Whenever the system demands the execution of a non-feasible action, the planner comes into play. As listed below, the Planning algorithm starts with an empty plan Π, to be completed with a list of actions provided by services. This course of actions is intended to emulate the demanded non-feasible action, and it is provided as a set of actions performed on objects, A and O respectively, together with the results R of accomplishing such actions.
Planning(Π, A, O, R)
Let us imagine a system where the biometric identification service is not available, while services such as face recognition or face detection are. In order to figure out the alternative set of actions that could simulate the same functionality, the
planner is instantiated with the following arguments: the plan to be executed (initially empty), the non-feasible action to be simulated, the object on which the action is to be performed, and the result of performing that action, something like:

Planning(plan, {biometric identification}, {biometric feature}, {person identity})

where {biometric identification} is the action to be simulated by means of the actions available in the system, {biometric feature} is the object on which the action is to be performed (notice that instances of biometric feature could be a face, an iris, or a fingerprint), and {person identity} is the outcome of performing a biometric identification on a biometric feature. For a simulated context with a particular set of devices providing a set of services, the resulting plan for the previous request is the following list of ternary elements, to be executed bottom-up:

1. {face recognition} {capture result of a recording image device} {person identity}, using the available face recognition system.
2. {detecting face} {capture result of a capturing biometric feature} {image file}, using the available face detector system.
3. {capturing face} {thing} {image file}, using the available video camera service.
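The backward-chaining flavor of this decomposition can be sketched in a few lines of Python. The action catalogue, the string-based matching of effects and requirements, and the helper names below are all our own assumptions; they merely reproduce the shape of the example plan above, not the authors' actual algorithm over Scone contexts.

def plan_for(result, actions, plan=None):
    """Find a chain of feasible actions whose effects produce `result`,
    recursing on each chosen action's requirements (its before context)."""
    plan = [] if plan is None else plan
    for name, info in actions.items():
        if result in info["produces"]:
            mark = len(plan)
            plan.append((name, info["object"], result))
            if all(plan_for(req, actions, plan) is not None
                   for req in info["requires"]):
                return plan
            del plan[mark:]        # this branch failed, undo it and keep trying
    return None

# Hypothetical catalogue of the feasible actions offered by the available services.
actions = {
    "face recognition": {"object": "image file",
                         "produces": {"person identity"},
                         "requires": ["face noticed in image file"]},
    "detecting face":   {"object": "image file",
                         "produces": {"face noticed in image file"},
                         "requires": ["image file"]},
    "capturing face":   {"object": "thing",
                         "produces": {"image file"},
                         "requires": []},
}

# The plan is listed top-down here; as in the text, it would be executed bottom-up.
print(plan_for("person identity", actions))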
4 Implementation Details

The Belief-Desire-Intention (BDI) model has proved to be a powerful framework for building rational agents [19], mainly because it makes it possible to describe the agent behavior in terms of the goals to be met, the knowledge the agent holds about the context, and a set of plans that guide the agent towards its goals. Indeed, the BDI model is intended to reproduce the process carried out when people make decisions to achieve a certain goal, and it fits naturally with the action planning proposed here. Adopting such an approach allows the Ambient System to state its objectives as general goals and leaves it to the BDI system to determine how these goals are to be achieved. This is where the common-sense planner comes into action: it receives the general description of the goal to be achieved, the object of that goal, and the expected result of achieving the goal. On the basis of this information, the planner provides the intelligent agent with the list of services that have to be instantiated.

The multi-agent system implementation is built on Jadex [20], an agent-oriented reasoning engine for rational agents. Instead of using formal logic descriptions, Jadex proposes the use of two commonly known languages, Java and XML. The BDI agent is modeled by mapping beliefs onto Java objects, while desires and intentions are mapped onto procedural recipes coded in Java that the agent carries out in order to achieve a goal. Although agents have been a widely adopted solution for supporting service composition in the field of Web Services, their adoption in other fields of
distributed systems has not been as prolific. Unlike Web Services, which share a communication protocol such as SOAP, services deployed in pervasive environments are rarely capable of cooperating with other services, mainly due to the heterogeneity of the protocols they implement. This drawback is tackled in this work by means of a middleware technology, ZeroC Ice [21], which makes the communication process transparent.
Fig. 2. System overview diagram
Figure 2 depicts an overall view of the Multi-Agent System (MAS) in its role of observing and regulating the Ambient Intelligence context. The MAS is composed of four agents, known as Manager, Selector, Composer, and Provider. In order to be context-aware, the Manager agent subscribes to the channels where services publish their status information and accept requests. Hypothetically, a presence sensor service would publish a presence detection to one of these channels whenever presence is noticed. The Manager agent, which has been implemented to react to this sort of event, automatically notifies the Selector agent, which is the one that knows how to handle the situation. The following code, extracted from the description of the
Selector agent, states that whenever an unauthorisedPresence event has been triggered, one of the goals to be achieved is intruder_identification.

<parameter name="unauthorisedPresence" class="Event">
  beliefbase.eventTypes
The novelty here is that plans are not hard-coded in the Selector's plan code; on the contrary, they are stated as requests to the planning module. For instance, whenever the goal intruder_identification is dispatched, the get_biometric_ID plan is used to accomplish it. In the plan's Java code, a statement like the following can be found:

getPlan("{identification}", "{biometric feature}", "{person identity}")

The main strength of this approach lies in the generality of the plans managed by the Selector agent. Finally, the Composer agent is in charge of instantiating each of the actions composing the plan, and the Provider agent is in charge of publishing the new composite service as an available service of the system.
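The overall interaction just described (context event, goal dispatch, planner call, service instantiation) can be summarized by the hypothetical Python sketch below. The real system is written in Java/XML on top of Jadex, so every class and method name here is illustrative only.

class Selector:
    """Reacts to context events by turning them into planning requests."""
    def __init__(self, planner):
        self.planner = planner

    def on_event(self, event):
        if event == "unauthorisedPresence":
            # Goal: identify the intruder; the planner decides which services to chain.
            return self.planner.get_plan("{identification}",
                                         "{biometric feature}",
                                         "{person identity}")
        return None

class Composer:
    """Instantiates each action of the plan as a concrete service invocation."""
    def instantiate(self, plan):
        return [f"invoke service providing {action} on {obj}"
                for action, obj, _result in plan]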
5 Conclusions

The work presented here provides a partial solution to the self-sufficiency issue raised by systems for Ambient Intelligence. To this end, this work proposed a comprehensive solution intended to provide automatic service composition, achieved by means of a common-sense planning strategy. Combining a Belief-Desire-Intention approach with the Scone system lays the basis for implementing an action planner capable of automating the service composition task. The use of a middleware layer places an abstraction layer between the heterogeneous services and the Ambient System supervising the environment, in such a way that service instantiation and supervision are achieved simply by monitoring the communication channels where information is published and from where services receive invocations.
References
[1] Ducatel, K., Bogdanowicz, M., Scapolo, F., Leijten, J., Burgelman, J.C.: ISTAG: Scenarios for ambient intelligence in 2010. Technical report, ISTAG (2001)
[2] Wilensky, R.: Planning and Understanding. Addison-Wesley, Reading (1983)
[3] Amigoni, F., Gatti, N., Pinciroli, C., Roveri, M.: What planner for Ambient Intelligence applications? IEEE Transactions on Systems, Man, and Cybernetics, Part A 35(1), 7–21 (2005)
[4] Erol, K., Hendler, J., Nau, D.S.: HTN planning: Complexity and expressivity. In: AAAI 1994 (1994)
[5] Villanueva, F.J., Villa, D., Santofimia, M.J., Moya, F., Lopez, J.C.: A framework for advanced home service design and management. IEEE Transactions on Consumer Electronics 55(3), 1246–1254 (2009)
[6] Mueller, E.T.: Commonsense Reasoning. Morgan Kaufmann, San Francisco (2006)
[7] Hommel, B., Musseler, J., Aschersleben, G., Prinz, W.: The theory of event coding (TEC): A framework for perception and action planning. Behavioral and Brain Sciences 24, 849–878 (2001)
[8] Bach, K.: Actions are not events. Mind 89, 114–120 (1980)
[9] Hyman, J.: Three fallacies about action. In: 29th International Wittgenstein Symposium (2006)
[10] Georgeff, M.P.: A theory of action for multiagent planning. In: Bond, A.H., Gasser, L. (eds.) Readings in Distributed Artificial Intelligence, pp. 205–209. Kaufmann, San Mateo (1988)
[11] W3C: OWL web ontology language home page (2009), http://www.w3.org/TR/owl-features/ (retrieved March 15, 2009)
[12] W3C: Resource description framework (RDF) (2010), http://www.w3.org/RDF/ (retrieved February 28, 2010)
[13] DiBernardo, M., Pottinger, R., Wilkinson, M.: Semi-automatic web service composition for the life sciences using the BioMoby semantic web framework. J. of Biomedical Informatics 41(5), 837–847 (2008)
[14] Cycorp, I.: The Cyc project home page (2008), http://www.cyc.com (retrieved December 10, 2008)
[15] Fahlman, S.E.: The Scone knowledge-base project (2010), http://www.cs.cmu.edu/sef/scone/ (retrieved February 28, 2010)
[16] Davidson, D.: The Essential Davidson. Oxford University Press, USA (2006)
[17] Chen, W., Fahlman, S.E.: Modeling mental contexts and their interactions. In: AAAI 2008 Fall Symposium on Biologically Inspired Cognitive Architectures, Washington (2008)
[18] Fahlman, S.E.: Marker-passing inference in the Scone knowledge-base system. In: Lang, J., Lin, F., Wang, J. (eds.) KSEM 2006. LNCS (LNAI), vol. 4092, pp. 114–126. Springer, Heidelberg (2006)
[19] Wooldridge, M.J.: Reasoning about Rational Agents. The MIT Press, Cambridge (2000)
[20] Pokahr, A., Braubach, L., Lamersdorf, W.: Jadex: A BDI reasoning engine. In: Multi-Agent Programming, pp. 149–174. Springer Science+Business Media, USA (2005)
[21] ZeroC, I.: Ice home page (2008), http://www.zeroc.com/ (retrieved December 20, 2008)
Dialogue Manager for a NLIDB for Solving the Semantic Ellipsis Problem in Query Formulation

Rodolfo A. Pazos R.1, Juan C. Rojas P.2, René Santaolaya S.2, José A. Martínez F.1, and Juan J. Gonzalez B.1

1 Inst. Tecnológico de Cd. Madero, Cd. Madero, Mexico
2 Centro Nacional de Investigación y Desarrollo Tecnológico, Cuernavaca, Mexico
[email protected], [email protected], [email protected], [email protected], [email protected]
Abstract. A query written in natural language (NL) may involve several linguistic problems that cause the query not to be interpreted or translated correctly into SQL. One of these problems is implicit information or semantic ellipsis, which can be understood as the omission of important words in the wording of a query written in NL. An exhaustive survey of NLIDB works has revealed that most of them have not dealt systematically with semantic ellipsis. In experiments conducted on commercial NLIDBs, very poor results (7% to 16.9%) have been obtained when dealing with query corpora that involve semantic ellipsis. In this paper we propose a dialogue manager (DM) for a NLIDB for solving semantic ellipsis problems. The operation of this DM is based on a typification of elliptical problems found in queries, which makes it possible to deal with this problem systematically. Additionally, the typification has two important characteristics: domain independence, which permits the typification to be applied to queries for different databases, and generality, which means that it holds for different languages such as English, French, Italian, Spanish, etc. These characteristics are inherited by the dialogue processes implemented in the DM, since they are based on this typification. In experiments conducted with this DM and a NLIDB on a corpus of elliptical queries, an increase of 30-35% in correctly answered queries was attained.

Keywords: Natural language, natural language interfaces to databases, dialogue manager.
1 Introduction

In this paper the implementation of a dialogue manager (DM) for a natural language interface to databases (NLIDB) in Spanish [1, 2, 3] is presented, which is based on a typification of problems in queries that deals systematically with the semantic ellipsis problem. Semantic ellipsis occurs naturally in queries written in natural language (NL). Usually human beings write the same way they speak. When communicating with
another human being, we do so assuming a context. In conversations, we do not speak text-like, i.e., using paragraphs, parentheses, underscores, etc.; we rather speak with short replies, and the omitted information is clear from the previous context, the situation, the participants' actions and the general knowledge about these types of situations.

The problem that occurs when semantic ellipsis is not correctly treated, or not dealt with at all, by NLIDBs is that it causes the interface to interpret or translate an NL query incorrectly into a database query language (such as SQL), as revealed by the results of experiment 2 (described in Section 4). Queries misinterpreted by a NLIDB may lead users to make potentially harmful decisions based on incorrect results returned by the NLIDB.

The works developed on NLIDBs are many and diverse; each deals with several types of problems and proposes several approaches. For the purposes of this paper, we consider the most recent ones. Among the most important domain-dependent NLIDBs without a DM we can mention ILNES [4], VILIB [5], Kid [6], and CISCO [7]. Regarding the main domain-independent interfaces without a DM, the following stand out: EnglishQuery [8], PRECISE [9], ELF [10], Edite [11], SystemX [12], and MASQUE/SQL [13]; while, among the principal domain-independent NLIDBs with a DM, we can mention Rendezvous [14], STEP [15], TAMIC [16], and CoBase [17]. Other commercial interfaces that include a DM and are worth mentioning are BBN's PARLANCE [18] and InBase [19]. Despite the number and diversity of works on NLIDBs, none has clearly identified or systematically dealt with the semantic ellipsis problem.

The dialogue processes implemented in the DM described here inherit two important characteristics from the typification of query problems: domain independence, which means that these processes can be applied to queries for any database without changing the DM software, and generality, which means that these processes can be applied to different languages such as English, French, Italian, Spanish, etc.
2 Typification of Semantic Ellipsis Problems in Queries

Before describing the DM, we present a typification of problems in elliptical queries (those that involve semantic ellipsis), which resulted from analyzing query corpora for the Northwind (198 queries in Spanish) and Pubs (70 queries in Spanish) databases, whose queries were formulated by students (detailed in [1]), and for the ATIS database benchmark (2407 out of 2884 queries in English) [20]. The typification obtained is shown in Table 1. As can be observed in Table 1, the typification is not associated with a particular DB (i.e., it is domain independent) and it is general, since the problem types identified are common to several languages such as English, French, Italian, Spanish, etc. Furthermore, the correctness of this typification has been formally proved [21], which means that it fulfills three important criteria: completeness, disjunction and domain independence. Using the typification of Table 1, dialogue processes were implemented that address the semantic ellipsis problem systematically.
Table 1. Typification of semantic ellipsis problems in queries
Case   Description
1      Explicit columns and tables*
1.1    Queries that include Select and Where clauses and explicitly include column and table names (therefore, no clarification of information is needed)
1.2    Queries that include only the Select clause and explicitly include column and table names (therefore, no clarification of information is needed)
2      Univocal implicit column in the Select clause
2.1    Queries that include a table name without specifying which of its columns is(are) requested
3      Univocal implicit column in the Where clause
3.1    Queries that include a table name and a value in the search condition without specifying the table column related to the value
4      Multivocal implicit column in the Select clause
4.1    Queries that include a column name without specifying one of the several tables to which it may be related
5      Multivocal implicit column in the Where clause
5.1    Queries that include the name of a column without specifying one of the several tables to which it may be related
5.2    Queries that include a value in the search condition without specifying the column or the table to which it may be related (therefore, the value may be related to any column of any table)
6      Inexistent column
6.1    Queries that include a column name that cannot be related to any database table
7      Inexistent table*
7.1    Queries that include the name of a table that does not belong to the database
8      Ill-formed queries*
8.1    Queries such that in the Select clause neither a column name nor a table name is specified (i.e., there is no Select clause)
8.2    Queries whose Where clause includes a column name and/or a table name without specifying a search value
* No dialogue processes were implemented for cases 1, 7 and 8; however, the DM informs the user about the type of problem detected.
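As a rough illustration of how a query can be mapped onto the cases of Table 1, the following Python sketch classifies a query from the column names, table names and search values recognized in it. The input structures, and the fact that Select and Where positions are not distinguished here, are simplifying assumptions of ours; the actual NLIDB performs this analysis in its lexical and syntactic modules.

def classify_query(columns, tables, values, has_select_item, schema):
    """Map the lexicon hits of a query to a case of Table 1 (simplified)."""
    if any(all(c not in cols for cols in schema.values()) for c in columns):
        return "6.1"                     # column not found in any table
    if any(t not in schema for t in tables):
        return "7.1"                     # table not found in the database
    if not has_select_item:
        return "8.1"                     # ill-formed: no Select clause at all
    if tables and not columns:
        return "2.1/3.1"                 # table given, column(s) left implicit
    if columns and not tables:
        ambiguous = [c for c in columns
                     if sum(c in cols for cols in schema.values()) > 1]
        if ambiguous:
            return "4.1/5.1"             # column name matches several tables
    if values and not columns and not tables:
        return "5.2"                     # bare value, any column of any table
    return "1.1/1.2"                     # explicit columns and tables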
3 Implementation of the Dialogue Manager

The DM implemented in the NLIDB [1, 2, 3] consists of dialogue processes based on the typification of Table 1. Fig. 1 shows the conceptual diagram of the NLIDB used and the DM implemented; details of the implementation of the NLIDB can be found in [1, 2, 3]. The diagram of Fig. 1 shows the three main components: the NL-SQL translator comprises all the modules within the dashed-line rectangle, the DM consists of all the modules inside
the dotted-line rectangle, and the rest of the figure describes the interaction between the user and the NLIDB. The event flow of the DM proceeds as follows:

1. Once a DB is selected, an automatic configuration process of the interface is performed (generation of dictionaries and domain data) from the DB metadata.
Fig. 1. Conceptual diagram of a NLIDB with DM
2. The query, written in NL by the user, is processed by several modules of the interface until it reaches the Translation Module, where the process is interrupted.
3. The DM takes the information generated by the interface and analyzes it to determine whether information is missing for translating the query (this process is repeated as many times as missing information is detected):
   a. If a lack of information is detected, it proceeds to Step 4.
   b. If no lack of information is detected, then the translation process is carried out (circle D in the figure).
4. Upon identifying the type of problem, the DM selects the dialogue process appropriate for the problem, and:
   a. Executes the corresponding dialogue process.
   b. Retrieves from the DB schema information relevant to the information supplied and missing in the query.
   c. Requests information from the user (circle A in the figure).
   d. The user provides the requested information (circle B in the figure).
   e. Using the information from Step 4d, the query is modified and the process returns to the Query Processing module (circle C in the figure).
5. The query is translated to SQL and submitted to the DB server.

3.1 Implementation of Dialogue Processes

In this section the dialogue processes for the typification shown in Table 1 are described.

Case 2.1. Example: Can I have a listing of all flights from ATL to BOS? In this query a table name (flight) can be identified; however, the name(s) of the column(s) related to the flights has(have) not been mentioned. The dialogue process devised for solving this kind of problem proceeds as follows:
1. Look in the queried DB for the table name specified (in this case, flight).
2. Look into the table found in Step 1 for column descriptions.
3. Display the following question to the user: Regarding flight, which information would you like to obtain? and display a list (check-box type) of the column descriptions found in Step 2.
4. Once the omitted column(s) is(are) selected by the user (in this case, flight number), the query is internally modified by the DM, thus completing the missing information. In this case, the modified query would read as follows: Can I have a listing of flight number of all flights from ATL to BOS?

Case 3.1. Example: Give me the address of the employee Margaret Peacock. In this query a table name (employee) and the values (Margaret Peacock) next to the table name can be identified; however, the names of the columns related to the supplied values have been omitted. The dialogue process devised for solving this kind of problem proceeds as follows:
1. Look in the queried DB for the table specified (employee).
2. Determine the data type of the supplied value (Margaret) and look into the table for columns whose data types match that of the supplied value.
3. Display the following question to the user: Regarding Employee, which of the following may Margaret refer to? and display a list (check-box type) of the column descriptions corresponding to the columns found in Step 2, from which the user has to select one item.
4. Once the omitted column is selected by the user (first name), the query is internally modified by the DM, thus completing part of the missing information. In this case information is still missing (the column that contains the value Peacock); therefore, Step 3 is executed again, but the new question involves the value Peacock instead of Margaret. The final query would read as follows: Give me the address of the employee whose first name is Margaret and whose last name is Peacock.

Case 4.1. Example: What is the arrival time in Los Angeles? In this query the name of a column is specified (arrival time) but the table to which it belongs is not; there could be several tables that contain a column called arrival time. The algorithm for this problem is the following:
1. Look in the DB for tables that contain the column name (arrival time) specified in the query.
2. Display to the user the following question: What may arrival time correspond to? together with a list (radio-button type) of the tables found in Step 1.
3. Once the omitted table (flight) is selected by the user, the query is internally modified by the DM. The resulting query would be as follows: What is the flight arrival time in Los Angeles? Notice that this query is still incomplete; therefore, it has to be clarified further.

Case 5.1. Example: Give me the address of Alfredo Futterkiste. The algorithm is similar to that for case 4.1, but in this case the problem occurs in the Where clause. In the query a column name (address) is specified as well as a value (Alfredo Futterkiste) associated with that column. The algorithm is the following:
1. Look in the DB for tables that contain the column name (address) specified in the query.
2. Display to the user the following question: What may address correspond to? together with a list (radio-button type) of the tables found in Step 1.
3. Once the omitted table (employee) is selected by the user, the query is internally modified by the DM. The resulting query would be as follows: Give me the address of employee Alfredo Futterkiste. This query is still incomplete; therefore, it has to be clarified further.

Case 5.2. Example: How many engines does an M80 have? In this query a value (M80) is specified; however, no table name is mentioned, nor a column name associated with the supplied value. The algorithm for this type of problem is the following:
1. Determine the data type of the supplied value (M80).
2. Look up all the tables that constitute the DB.
3. Display to the user the question: What may M80 correspond to? together with a list (radio-button type) of the tables found in Step 2.
4. Once the omitted table (aircraft) is supplied by the user, the query is modified by the DM. The resulting query would be as follows: How many engines does an aircraft M80 have? Since this query is still incomplete, it has to be clarified further.

Case 6.1. Example: Give me the schedule for flights from Dallas to Denver starting 05/01/1990. In this query the word schedule, which could be considered to refer to a column, cannot be found in the DB. The algorithm for this type of problem is as follows:
1. Look in the DB for the column (schedule) specified.
2. Display a message to the user indicating that the supplied column name cannot be found in any of the DB tables.

Case 7.1. Example: Give me the address of Bar Harris. In this query the word Bar, which could be considered to refer to a table, cannot be found in the DB. The algorithm for this type of problem is the following:
1. Look in the DB for the table (bar) specified.
2. Display a message to the user indicating that the supplied table name cannot be found in the DB.
Case 8.1. Example: From SFO to ATL. In this query the words SFO and ATL could be considered to refer to search values, but no table name is mentioned, nor a column name associated with the supplied values. In this case the NLIDB does not have enough information for translating the query.

Case 8.2. Example: Give me the departure time of flight. In this query a column name (departure time) and a table name (flight) are specified; however, no search value is supplied. This case has not been dealt with in the current version of the NLIDB.

As can be seen in the preceding algorithms, the implementation of the dialogue processes is relatively simple; a sketch of one of them is shown below. An important characteristic of these algorithms is that they inherit domain independence and generality from the typification (Table 1): the dialogue processes can be applied to queries for any DB and to several languages such as English, French, Italian, Spanish, etc. Another important aspect is that these processes do not use a trial-and-error approach; rather, they analyze the information supplied in the query as well as the information from the DB schema and formulate precise questions to the user to request the missing information. Finally, it is important to point out that a query may simultaneously include two or more problems, in which case more dialogue cycles are necessary to clarify the query. The more problems a query contains, the more difficult its translation becomes, and in extreme cases the query may be impossible to translate, as in the example for case 8.1.
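The following Python sketch illustrates the shape of these dialogue processes for the ambiguous-column cases (4.1/5.1). The schema dictionary and the ask_user callback stand in for the NLIDB's metadata dictionary and its GUI widgets; they are assumptions, not the interface's actual API.

def clarify_ambiguous_column(column, schema, ask_user):
    """Case 4.1/5.1 style dialogue: find the tables containing `column`
    and, if there is more than one, ask the user which table was meant."""
    candidates = [table for table, cols in schema.items() if column in cols]
    if not candidates:
        return None                      # would be handled as case 6.1 instead
    if len(candidates) == 1:
        return candidates[0]             # univocal, no dialogue needed
    question = f"What may {column} correspond to?"
    return ask_user(question, candidates)   # radio-button style single choice

# Hypothetical usage with an ATIS-like schema; the lambda plays the user's role.
schema = {"flight": ["flight number", "arrival time", "departure time"],
          "aircraft": ["aircraft code", "engines"]}
chosen = clarify_ambiguous_column("arrival time", schema,
                                  ask_user=lambda q, options: options[0])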
4 Experimental Results

Experiment 1. This experiment was conducted to evaluate the performance of our NLIDB [1, 2, 3] on a corpus of elliptical queries. A sample of 102 queries (out of the 2884 queries of the ATIS benchmark corpus [20]) was translated into Spanish and tested on the NLIDB with and without dialogue. (Note: this translation was necessary because the NLIDB is for Spanish.) The results are shown in Table 2.

Table 2. Results for five types of problems in queries from the ATIS benchmark using our NLIDB
NLIDB              Results      Types of Problems                   Total   %
                                2.1   3.1   4.1   5.2   6.1
Without dialogue   Correct      0     5     2     2     1           10      10
                   Incorrect    16    22    7     13    7           65      64
                   Unanswered   2     3     1     10    11          27      26
                   Total        18    30    10    25    19          102     100
With dialogue      Correct      4     14    6     8     9           41      40
                   Incorrect    7     9     3     3     3           25      25
                   Unanswered   7     7     1     14    7           36      35
                   Total        18    30    10    25    19          102     100
As can be observed in Table 2, an increase of 30% in the percentage of correctly answered queries is achieved by the NLIDB that includes dialogue with respect to the NLIDB without it. It is worth mentioning that the modest performance achieved by the NLIDB with dialogue (40%) is caused by the extreme difficulty of the query sample used for this experiment, as the following experiment shows for a commercial NLIDB.

Experiment 2. This experiment was carried out to assess the performance of ELF [10] (reputed to be one of the best commercial NLIDBs). Since the behavior of ELF depends on the customization needed to adapt it to a given database, we decided to use as many customizations as possible. The customization of ELF (like that of most NLIDBs) requires some expertise, so we recruited 20 MS students in Computer Science. For the customization, a sample of 20 representative queries from the ATIS benchmark was taken; ELF was then customized by the 20 students using this 20-query sample, from which 20 instances of ELF were obtained (one for each customization). Finally, each of these instances was tested using the same sample of 102 queries mentioned in the first experiment. The results are summarized in Table 3.

Table 3. Results for five types of problems in queries from the ATIS benchmark using ELF
Customizer        Results      Types of Problems                   Total   %
                               2.1   3.1   4.1   5.2   6.1
20 MS students    Correct      1     2     1     3     0           7       7
                  Incorrect    7     9     2     11    9           38      37
                  Unanswered   10    19    7     11    10          57      56
                  Total        18    30    10    25    19          102     100
It is important to mention that ELF does not have a DM, but it does permit customizing the interface for dealing with recurrent patterns of errors encountered in query processing (instead of using the default automatic customization). However, despite including this manual customization, the experiments revealed that ELF does not deal successfully with semantic ellipsis, as shown by the strikingly low 7% of correctly answered queries. This gives an idea of the importance of this problem and of how it can affect the performance of NLIDBs.

Experiments 3 and 4. These experiments were conducted to evaluate the performance of our NLIDB+DM [1, 2, 3] for users with different ability levels. For these experiments a sample of 20 representative queries from the ATIS database benchmark [20], covering the problem types shown in Tables 4 and 5, was considered. The sample queries were translated into Spanish and executed on the interface with DM by 5 MS students in Computer Science for experiment 3 and by 5 undergraduate students in Information Technology for experiment 4. The results are shown in Tables 4 and 5.

Tables 4 and 5 show that BS students obtained 79% correct answers (10% less than MS students). One possible explanation for these results is that MS students are in fact more skilled; another is that BS students were less motivated, since they were enrolled at an institution different from the one where our NLIDB was developed (unlike the MS students). Note: the good performance in experiments 3
and 4 (as compared with that of experiment 1) is explained by the fact that the queries for these experiments are not as difficult as those for experiment 1.

The results of the preceding experiments show that the DM, based on the typification of Table 1, helps improve the percentage of correctly answered queries. Additionally, in other experiments on a corpus of 154 queries for the Northwind database, which are less difficult than those from the ATIS benchmark, our NLIDB [1, 2, 3] attained 85% of correctly answered queries, which compares favorably with the 17% and 75% obtained by the English Query [8] and ELF [10] commercial interfaces (published elsewhere [22, 23]).

Table 4. Results obtained by MS students for five types of problems in queries from the ATIS benchmark
Customizer   Results      Types of problems                   Total   %
                          2.1   3.1   4.1   5.2   6.1
Student 1    Correct      3     5     2     4     4           18      90
             Incorrect    1     0     1     0     0           2       10
Student 2    Correct      2     5     2     4     4           17      85
             Incorrect    2     0     1     0     0           3       15
Student 3    Correct      2     4     3     4     4           17      85
             Incorrect    2     1     0     0     0           3       15
Student 4    Correct      2     5     3     4     4           18      90
             Incorrect    2     0     0     0     0           2       10
Student 5    Correct      3     5     3     4     4           19      95
             Incorrect    1     0     0     0     0           1       5
Average      Correct      2.4   5.8   2.6   4     4           18.8    89
             Incorrect    1     0.12  0.4   0     0           1.52    11
Table 5. Results obtained by undergraduate students for five types of problems in queries from the ATIS benchmark
Customizer   Results      Types of problems                   Total   %
                          2.1   3.1   4.1   5.2   6.1
Student 1    Correct      3     3     1     4     3           14      70
             Incorrect    1     2     2     0     1           6       30
Student 2    Correct      1     5     2     4     3           15      75
             Incorrect    3     0     1     0     1           5       25
Student 3    Correct      3     4     2     4     4           17      85
             Incorrect    1     1     1     0     0           3       15
Student 4    Correct      2     5     2     4     3           16      80
             Incorrect    2     0     1     0     1           4       20
Student 5    Correct      2     5     2     4     4           17      85
             Incorrect    2     0     1     0     0           3       15
Average      Correct      2.2   4.4   1.8   4     3.4         15.8    79
             Incorrect    1.8   0.6   1.2   0     0.6         4.2     21
5 Final Remarks

This work continues the development of a NLIDB for Spanish [1, 2, 3], which has now been supplemented with dialogue processes. We describe the implementation of a dialogue manager based on a typification of the semantic ellipsis problems occurring in queries, which constitutes an approach completely different from previous work on NLIDBs due to its generality. Therefore, the proposed approach can be applied to other domain-independent or domain-dependent NLIDBs, as well as to other languages such as English, French, Italian, etc.

The dialogue processes based on the proposed typification have three important characteristics: domain independence, ease of configuration, and NL independence. Concerning ease of configuration, it is worth mentioning that our dialogue processes need no configuration of their own, since they rely entirely on the semiautomatic configuration of our NLIDB, which extracts information from the DB schema and short descriptions of tables and columns.

As a result of integrating a DM, based on a typification of semantic ellipsis problems in queries, into our NLIDB [1, 2, 3], we managed to improve by 30% (Table 2) the percentage of correctly answered queries that exhibit the semantic ellipsis problem. We observed, from tests of the ELF and English Query commercial interfaces, that NLIDBs that do not deal systematically with semantic ellipsis may yield poor results (75% and 17% success rates) for elliptical queries [22, 23]. We also observed from experiment 2 that, when dealing with a complex real-life database such as ATIS, the semantic ellipsis problem may grow even worse, as suggested by the results in Table 3.
References 1. González, J.J.: Traductor de Lenguaje Natural Español a SQL para un Sistema de Consultas a Bases de Datos. PhD dissertation. Computer Sci. Dept., Centro Nacional de Investigación y Desarrollo Tecnológico, Cuernavaca, Mexico (2005) 2. Pazos, R.A., Pérez, J., González, J.J., Gelbukh, A., Sidorov, G., Rodríguez, M.J.: A Domain Independent Natural Language Interface to Databases Capable of Processing Complex Queries. In: Gelbukh, A., de Albornoz, Á., Terashima-Marín, H. (eds.) MICAI 2005. LNCS (LNAI), vol. 3789, pp. 833–842. Springer, Heidelberg (2005) 3. González, J.J., Pazos, R.A., Cruz, C., Fraire, H.J., Aguilar, S., Pérez, J.: Issues in Translating from Natural Language to SQL in a Domain-Independent Natural Language Interface to Databases. In: Gelbukh, A., Reyes-Garcia, C.A. (eds.) MICAI 2006. LNCS (LNAI), vol. 4293, pp. 922–931. Springer, Heidelberg (2006) 4. Rocher, G.: Traducción de Queries en Prolog a SQL. BS thesis, Universidad de las Américas, Puebla, Mexico (1999) 5. CORDIS: Telematics for Libraries - Projects, VILIB (1999), http://cordis.europa.eu/libraries/en/projects/vilib.html 6. Chae, J., Lee, S.: Frame Based Decomposition Method for Korean Language Query Processing. Computer Processing of Oriental Languages 11(4), 353–379 (1998) 7. GPLSI: Procesamiento del Lenguaje Natural (1998), http://gplsi.dlsi.ua.es/gplsi/areas.htm 8. Microsoft TechNet, Chapter 32- English Query Best Practices (2009), http://technet.microsoft.com/es-mx/library/cc917659en-us.aspx
9. Popescu, A., Etzioni, O., Kautz, H.: Towards a Theory of Natural Language Interfaces to Databases. In: Proc. International Conference on Intelligent User Interfaces, Miami, USA, pp. 149–157 (2003) 10. ELF Software, ELF Software Documentation Series (2002), http://www.elfsoft.com/help/accelf/overview.htm 11. Reis, P., Matias, J., Mamede, N.: Edite - A Natural Language Interface to Databases: a New Dimension for an Approach. In: Proc. 4th International Conference on Information and Communication Technology in Tourism, Edinburgh, Scottland (1997) 12. Cercone, N., Mcfetridge, P., Popowish, F., Fass, D., Groeneboer, C., Hall, G.: The System X Natural Language Interface: Design, Implementation and Evaluation. Technical report. Centre for System Science, Simon Fraser University, British Columbia, Canada (1993) 13. Androutsopoulus, I., Ritchie, G., Thanish, P.: MASQUE/SQL, an Efficient and Portable Natural Language Query Interface for Relational Databases. In: Proc. 6th International Conference on Industrial & Engineering Applications of Artificial Intelligence and Expert Systems, Edinburgh, UK, pp. 327–330 (1993) 14. Minock, M.: A STEP Towards Realizing Codd’s Vision of Rendezvous with the Casual User. In: Proc. 33rd International Conference on Very Large Databases, Vienna, Austria, pp. 1358–1361 (2007) 15. Minock, M.: Natural Language Access to Relational Databases through STEP. Technical report. Dept. Computer Science, University of Umea, Umea, Sweden (2004) 16. Bagnasco, C., Bresciani, P., Magnini, B., Strapparava, C.: Natural Language Interpretation for Public Administration Database Querying in the TAMIC Demonstrator. In: Proc. 2nd International Workshop on Applications of Natural Language to Information Systems, Amsterdam, The Netherlands (1996) 17. Chu, W., Yang, H., Chiang, K., Minock, M., Chow, G., Larson, C.: Cobase: A Scalable and Extensible Cooperative Information System. Journal of Intelligent Information Systems 6, 223–259 (1996) 18. Ott, N.: Aspects of the Automatic Generation of SQL Statements in a Natural Language Query Interface. Information Systems 17(2), 147–159 (1992) 19. Boldasov, M.V., Sokolova, G.E.: QGen – Generation Module for the Register Restricted InBASE System. In: Gelbukh, A. (ed.) CICLing 2003. LNCS, vol. 2588, pp. 465–476. Springer, Heidelberg (2003) 20. DARPA Air Travel Information System, ATIS0 (1990), http://www.ldc.upenn.edu/Catalog/readme_files/atis/sdtd/ trn_prmp.html 21. Rojas, J.C.: Administrador de Diálogo para una Interfaz de Lenguaje Natural a Bases de Datos. PhD dissertation. Centro Nacional de Investigación y Desarrollo Tecnológico, Cuernavaca, Mexico (2009) 22. Pazos, R., Santaolaya, R., Rojas, J.C., Pérez, J.: Shedding Light on a Troublesome Issue in NLIBDs: Word Economy in Query Formulation. LNCS, pp. 641–648. Springer, Heidelberg (2008) 23. Pazos, R.A., Santaolaya, R., Rojas, J.C., Martínez, J.A., Pérez, J., Cruz, L.: Domain Independent Dialog Processes for Solving the Word-Economy Problem in a NLIDB. Polish Journal of Environmental Studies 17(4C), 457–462 (2008)
Hand Gesture Recognition Based on Segmented Singular Value Decomposition Jing Liu and Manolya Kavakli Department of Computing, Macquarie University, Australia {jliu,manolya}@ics.mq.edu.au
Abstract. The increasing interest in gesture recognition is inspired largely by the goal of creating systems that can identify specific human gestures and use them to convey information or control devices. In this paper we present a novel approach for recognizing hand gestures. The proposed approach is based on segmented singular value decomposition (SegSVD) and considers both local and global information in the gesture data. In this approach, first singular vectors and singular values are evaluated together to define the similarity of two gestures. Experiments with hand gesture data show that our approach can recognize gestures with high accuracy.
1 Introduction

Hand gesture recognition involves the capture, detection and recognition of hand movements. One common technique to obtain hand motion information is to adopt vision-based technologies, which use one or more cameras to collect images of hand movements. This method has problems due to image occlusion, ill-posedness and the large computational cost of recognition [1]. Another approach is to equip the user's hand with special gloves that capture hand and finger movements through dedicated sensors [2]. Compared to vision-based acquisition techniques, the higher precision of instrumented gloves and the smaller computational cost of recognition make glove-based technologies applicable to real-time human-computer interaction. In this paper we focus on recognizing gestures collected from glove sensors.

A gesture sensing device generally has several sensors that can be attached directly to the user's hand. For instance, a 5DT Data Glove 5 Ultra¹ has 7 sensors per hand, and a CyberGlove has around 20 sensors. These sensors measure finger bending or other movements of the fingers and palm. Data captured from the hand sensors is sent back to the computer at regular intervals for processing. The data from each sensor can be regarded as a time series, x(t), t = 1, 2, ..., m, generated sequentially along the time axis t; it is called a univariate time series (UTS) in some articles [3]. There is normally more than one sensor in a gesture acquisition device, and the data collected from all sensors in a device are used to represent different gestures. Therefore, a gesture is usually represented by a series of observations, xi(t), i = 1, ..., n, t = 1, ..., m, where i and t index the sensors in a device and the sampling time points respectively. For example, Fig. 1 shows typical gesture data received from a pair of 5DT Data Gloves.
http://www.vrealities.com/glove.html
Fig. 1. Typical gesture data from glove device (sensor values plotted against sampling time points)
This kind of gesture data can be viewed as a multivariate time series (MTS), which is conveniently stored in a matrix Gm×n = (g1(t), ..., gn(t)), t = 1, ..., m. A column of G represents the information from one sensor and is also called a variable. A row of G represents the information from all sensors at a sampling time point t.

Broadly, hand gestures have two aspects: a static aspect (hand postures) and a dynamic aspect (dynamic gestures) [4,5]. Static gestures are in general characterized by poses or configurations of the hands. Dynamic gestures can be defined either as the trajectory of the hands or as a series of hand postures in a sequence of gesture data. Many gestures also have both static and dynamic elements, as in sign languages. Our intuition is that people use dynamic gestures more than static postures in daily life, and how to recognize dynamic gestures is the focus of recent research in this field.

One of the most challenging issues in glove-based gesture recognition is that a particular gesture posed by different people may have different durations, since people move at different speeds. Even the same gesture posed by the same person at different times is probably not of identical length, due to the high flexibility of the human hand: intuitively, one cannot be expected to perform the same gesture exactly the same way twice. Another challenge in glove-based gesture recognition is device variability: the data captured for the same gesture by two different devices may be entirely different. Subtle control can be achieved with the development of today's data glove technology, and clearly the more accurate the information we get about the hands, the easier it is to recognize gestures. However, the high cost of highly accurate devices sometimes makes them less accessible for many researchers.
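To make the data layout concrete, the following short sketch (an illustration only; the sensor count and recording length are arbitrary) stores one recorded gesture as an m × n matrix whose columns are sensor channels and whose rows are sampling time points.

```python
import numpy as np

# Hypothetical recording: m = 80 sampling time points from n = 22 glove sensors.
# Each column G[:, i] is the univariate time series (UTS) of sensor i;
# each row G[t, :] holds the readings of all sensors at time point t.
m, n = 80, 22
G = np.random.rand(m, n)   # stands in for one captured gesture (an MTS item)

sensor_3 = G[:, 3]         # UTS of a single sensor
frame_at_t10 = G[10, :]    # all sensor values at sampling time point 10
print(G.shape)             # (80, 22)
```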
2 Related Work

There are several studies that address glove-based gesture recognition. Takahashi et al. recognize 30 out of 46 right-hand Japanese alphabet gestures with a data-glove-based system using joint angle coding techniques [6]. Ishikawa and Matsumura [1] propose an approach that recognizes hand gestures by measuring the joint angles of the hands.
Kadous uses PowerGloves to recognize 95 signs with 80% accuracy [7]. He then improves the accuracy rate to approximately 90% with higher-quality equipment using the method named TClass [8], which extracts metafeatures that are fed to machine learning algorithms (e.g. J48, Naive Bayes). The approach ROMAS utilizes biomechanical characteristics for user-independent gesture recognition with CyberGloves [9]. Eisenstein et al. [10] combine template matching and neural networks for gesture recognition; the implementation of this combined system is inevitably complicated. A new haptic system, uWave, based on 3-D accelerometers, requires only a single training sample and allows users to employ personalized gestures [11]. uWave is, however, still user-dependent, with decreasing performance when evaluated on different participants.

The approaches mentioned above rely to a greater or lesser extent on the devices to obtain domain knowledge about the hand. Sometimes one has to obtain a full understanding of hand trajectories, and even of individual finger bends, in order to apply these methods in a particular context. They also treat the UTS items of each gesture individually and do not take correlations among UTS items into account. This may result in limitations and unreliability, since there are naturally some important correlations between the UTS items from different sensors [12].

As mentioned before, glove-based gesture data is typically stored in a matrix Gm×n and regarded as an MTS item. How to measure the similarity of two MTS items representing two gestures is the most important aspect of gesture recognition. Yang et al. [12] introduce a PCA-based similarity measure called the Extended Frobenius norm (Eros). It discriminates between different gestures using the sum of squared cosines of the angles between the top-k corresponding eigenvectors of two gesture data items. Li et al. [13] propose recognizing motion sequences by the SVD. They demonstrate that similar motions have similar trajectories; the projections of motion matrices obtained by the SVD should therefore be similar, especially along the direction with the maximal variance of the data (i.e., the direction of the first singular vector). Hence, they define a similarity measure based on the product of the effects of both the first singular vectors and the singular values. They also apply the first k eigenvectors weighted by the corresponding eigenvalues [14], selecting the number k empirically according to their experiments. In essence, all these approaches treat each MTS item as a whole entity and therefore adequately consider the global attributes of each gesture. As demonstrated by [15], however, the SVD can only reveal the characteristics of the data distribution globally and totally loses the temporal information: the SVD will treat a gesture and the same gesture performed in reverse as identical, although they could be totally different. In this paper, instead of relying on domain knowledge, we explicitly consider both local and global information in the gesture data. We propose a new approach for glove-based gesture recognition using segmented singular value decomposition (SegSVD).
3 Computing SegSVD

Our intention is to apply the SegSVD approach to gesture recognition. In this section we briefly discuss the SVD and the similarity measure based on the SVD as the background for the proposed approach.
3.1 Similarity Measure Based on Singular Value Decomposition

Suppose A is a general real m × n matrix; then there exists an SVD of A of the form A = U Σ V^T, where U = [u1, ..., um] ∈ R^{m×m} and V = [v1, ..., vn] ∈ R^{n×n} are two unitary matrices, and Σ is an m × n diagonal matrix of nonnegative real numbers, the singular values: σ1 ≥ σ2 ≥ ... ≥ σ_{min(m,n)} ≥ 0. The column vectors ui and vj are the ith left and the jth right singular vectors of A respectively. It has been shown in [13] that the singular values and the corresponding singular vectors of a matrix are uniquely determined. It can also be proven that the right singular vectors vi are equal to the corresponding ith principal components of the matrix. Principal component analysis obtains the coordinate axes along which the variances of the matrix data are extremal; hence, for a matrix, the maximal variances lie along the vi.

Intuitively, similar gestures should have similar trajectories despite having different time lengths. Hence, the geometric structures of the gesture data matrices revealed by their SVDs should be similar for the same gestures and different for different gestures [13]. This similarity can be described by the parallelness of the right singular vectors of the gesture data matrices [14]. Throughout this paper, we will refer to the right singular vectors simply as singular vectors for convenience.

It is known that the singular values are numerically correlated with the variances of the matrix along the corresponding singular vectors; the eigenvalues represent the importance of each associated singular vector. As can be seen from the singular value curve in Fig. 2, the first eigenvalue is the most dominant component of all the eigenvalues, and the other eigenvalues tail off gradually and finally approach zero. The geometrical effect of the first singular vector is therefore more important than that of the other components, and it is reasonable to consider only the parallelness of the first eigenvectors for a similarity measure. It can be observed from Fig. 3 that the first singular vectors are similar to each other for the same gestures and dissimilar for different gestures; the second singular vectors, however, can be quite different even for the same gestures.

As mentioned above, gesture data collected by gloves can be viewed as an MTS. Li et al. [13] propose a similarity measure between two multi-attribute motion sequences based on the inner product of the first singular vectors.

Fig. 2. Normalized singular values (number of singular values vs. normalized singular value)
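As a concrete illustration of the quantities used in this section, the following NumPy sketch extracts the first right singular vector and a normalized singular value vector from a gesture matrix; the unit-norm normalization is our assumption (the paper does not spell out the normalization), chosen so that the dot product of two such vectors lies in (0, 1].

```python
import numpy as np

def first_singular_data(G):
    """Return the first right singular vector and the normalized
    singular value vector of a gesture matrix G (m x n, m >= n)."""
    # Vt has the right singular vectors as rows; s holds sigma_1 >= sigma_2 >= ...
    _, s, Vt = np.linalg.svd(G, full_matrices=False)
    v1 = Vt[0]                      # first right singular vector (direction of maximal variance)
    s_norm = s / np.linalg.norm(s)  # assumed normalization: unit Euclidean norm
    return v1, s_norm
```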
Fig. 3. Singular vectors. (a) The first singular vector components of the same and of different gestures. (b) The second singular vector components of the same gestures.
Ψ(Q, P) = |u1 · v1| × (σ · λ − η)/(1 − η)    (1)

where σ and λ are the normalized singular value vectors corresponding to Q and P respectively, and η is a scale factor set to 0.9 empirically. Owing to the characteristics of the SVD, Eq. (1) can only reveal the distribution of the gesture data globally and loses the temporal information. To solve this problem, we propose to apply segmented singular value decomposition to the SVD-based similarity measure.

3.2 The Proposed SegSVD

In this section we describe the proposed SegSVD, which takes into account both temporal and global information in the gesture data. Let {Gi}, i = 1, ..., N, denote a data set of gesture samples with preclassified labels, where Gi ∈ R^{m×n} typically has many more rows m than columns n (m > n), and Gi = (g1, ..., gn), where each gi represents the samples from one sensor along the time axis. Our intention is to recognize an unknown gesture as belonging to one of the classes in the data set. The most challenging issue in achieving this goal is that the items in the data set, even for the same gesture, are probably of different lengths. One approach to this problem is to define a distance or similarity measure between the unknown item and the reference items in the data set, and then to use standard recognition approaches (e.g. 1NN or kNN) based on this measure.

As mentioned above, similar gestures should have similar trajectories. Our intuition is that the corresponding sub-parts of similar gestures should also have similar trajectories. For instance, if we segment each item into S (e.g. S = 5) parts along the time axis, the geometric structures revealed by the SVDs of the corresponding S subparts should be alike for similar gestures; we can then compare the similarity between corresponding subparts. The number of time steps in each subpart is called its duration l in the rest of this paper.

For a given data set {Gi}, Gi = (g1, ..., gn), the number of variables n is normally fixed while m varies from item to item. After segmentation we can rewrite Gi as Gi(td) = [g1(td), ..., gn(td)], d = 1, ..., S, where Gi(td) denotes the
subparts of each item. We can then define the similarity between two gesture items as follows:

Ψ(Gi, Gj) = (1/S) Σ_{d=1}^{S} Ψ(Gi(td), Gj(td))    (2)
Here Ψ(Gi(td), Gj(td)) indicates the similarity between two corresponding subparts of the two items and S is the number of subparts. We can see from Eq. (2) that the sole parameter is the number of subparts S, in addition to the choice of the basic similarity measure Ψ. We could employ any similarity measure (e.g. [13,14]) proven to be effective in related fields. The approach proposed in [14] employs the first k eigenvectors for the similarity measure; however, Fig. 3 shows that the second eigenvectors may be quite different for the same gestures, and we also observe in practice that, apart from the first eigenvectors, the other eigenvectors are quite different even for the same gestures. Hence, we adopt Eq. (1) as the basic similarity measure and revise Eq. (2) as follows:

Ψ(Gi, Gj) = (1/S) Σ_{d=1}^{S} [ |u1^d · v1^d| × (σ^d · λ^d − η)/(1 − η) ]    (3)
where u1^d and v1^d are the first eigenvectors of the dth subparts, and σ^d and λ^d are the normalized eigenvalue vectors associated with Gi(td) and Gj(td).

The most important issue for the proposed similarity measure, therefore, is how to segment each item. In practice, a larger S means a shorter duration l, which would probably give higher accuracy. On the other hand, the duration should be consistent with the temporal resolution of the problem: a short duration focuses more on local characteristics of the original data, while a longer duration tends to consider the gestures more globally. In our approach, we determine S and the duration l on the basis of the following conditions.
– The number of subparts S is identical for all items in the data set as well as for the unknown item to be recognized.
– The duration l is shorter than the minimum length of the items in the data set and longer than the number of variables n, that is, l > n.

Eq. (3) requires S to be identical for all items, since otherwise we would have to measure distances between two vectors of different lengths, which is beyond the scope of this paper. The second condition is adopted to simplify the calculation. However, we cannot determine the two parameters relying only on the conditions above. Since the number of subparts S is fixed for all items, we propose two ways to determine S and l.

One approach is to select a fixed S and a flexible l for different items. We call this approach average segmenting: each item Gi of length m is divided evenly into S parts, so the length of each part is l = [m/S], where [x] denotes rounding towards minus infinity. If m is not an integral multiple of S, the length of the last subpart is l = m − (S − 1) × [m/S]. This approach is easy to comprehend and to implement in the algorithm design. The problem might be
that the duration l becomes smaller than the number of variables n for some short items, which is not consistent with the second condition.

Another approach is to determine S and l empirically and apply the same fixed S and l to all items. This satisfies the second condition easily by choosing l larger than n. The key problem in this case is that S · l may not be equal to the length m. If S · l > m, we can adopt the idea of frame overlapping from speech signal processing [16]: the data used in a subpart overlaps with the previous subparts. We call this method subpart overlapping. If S · l ≤ m, we employ the first approach, average segmenting. In practice, however, we find that S · l > m always holds in our case, so we adopt subpart overlapping for the algorithm design in the following.

3.3 Gesture Recognition Algorithm Design

In order to recognize gestures, we first have a set of gesture samples as training samples; each gesture in the training set is then separated into S parts of length l. The algorithm initially assumes that the parameters S and l are predetermined and known beforehand. However, predetermined parameters may not be reasonable in many applications, so we estimate S and l by observing the average performance of the designed algorithm.

The first step is to estimate S and l. We use s to denote the length of the shortest item in the data set. Since s > n holds for the overwhelming majority of applications, the precondition is that n ≤ l < s. Considering that each item is separated into S parts, l should be at most s − S + 1. When l = s − S + 1, one row of data in each subpart overlaps with the previous subpart for the shortest item; when l < s − S + 1, the amount of overlapped data depends on l and m. As to the number of subparts, we pre-set S = 5 and l = n at the beginning of the algorithm. Kadous [8] applies a time-division feature for glove-based gesture recognition: he calculates the average value of each segment and uses these values as attributes to recognize gestures, and it was empirically found that five segments led to the best result. That probably means that five segments are not too many to be sensitive to variation in time and noise, and not too few to characterize gestures sufficiently. We therefore separate the gesture data into five subparts at first. The value of l increases from n to s − S + 1; it is convenient to repeat the algorithm for a fixed parameter S and then observe its average performance in order to estimate l. We thus obtain the following algorithmic procedure:

Input: training set, testing set. Output: error rate of classification
1. Pre-set S and l, and separate each item in the training set into S parts of size l × n. We take l ≥ n in the algorithm, based on our analysis.
2. For each item Gi in the training set, compute the SVD of its subparts and obtain the first eigenvectors u1^d (d = 1, ..., S) and the normalized eigenvalue vectors σ^d.
3. For each item Gj in the testing set, repeat Step 2 and obtain the first eigenvectors v1^d and the normalized eigenvalue vectors λ^d of its subparts.
4. Compute Ψ(Gi, Gj) as defined in Eq. (3). Gj is recognized as the gesture Gi with the highest Ψ(Gi, Gj).
5. Return the average error rate of classification.
6. Change the value of l and repeat Steps 2 to 5. Select the l with the lowest error rate of classification as the final parameter.
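To make the procedure concrete, here is a minimal NumPy sketch of Eqs. (1) and (3) together with subpart overlapping. It is our own reading of the algorithm, not the authors' implementation: the normalization of the singular value vectors (unit Euclidean norm) and the even spacing of the S overlapping windows over an item are assumptions of this sketch.

```python
import numpy as np

ETA = 0.9  # scale factor from Eq. (1)

def segments(G, S, l):
    """Split an m x n item into S subparts of length l.
    If S*l > m the windows overlap (subpart overlapping); the start points
    are spread evenly over the item -- an assumption of this sketch."""
    m = G.shape[0]
    starts = np.linspace(0, m - l, S).astype(int)
    return [G[s0:s0 + l] for s0 in starts]

def svd_features(G, S, l):
    """First right singular vector and normalized singular values of each subpart."""
    feats = []
    for sub in segments(G, S, l):
        _, s, Vt = np.linalg.svd(sub, full_matrices=False)
        feats.append((Vt[0], s / np.linalg.norm(s)))
    return feats

def basic_similarity(u1, sig, v1, lam):
    """Eq. (1): similarity of two subparts from their first singular
    vectors and normalized singular value vectors."""
    return abs(np.dot(u1, v1)) * (np.dot(sig, lam) - ETA) / (1.0 - ETA)

def seg_svd_similarity(Gi, Gj, S, l):
    """Eq. (3): average of the subpart similarities."""
    fi, fj = svd_features(Gi, S, l), svd_features(Gj, S, l)
    return np.mean([basic_similarity(u, s, v, t) for (u, s), (v, t) in zip(fi, fj)])
```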
4 Performance Evaluation

4.1 Dataset

The performance evaluation is carried out on a real-world dataset, the AUSLAN (Australian Sign Language) dataset [8]. The data was captured mainly using two Fifth Dimension Technologies (5DT) gloves and two Ascension Flock-of-Birds magnetic position trackers; in total, 22 sensors are involved, so each instance can be viewed as an MTS sample with 22 variables. In order to compare our algorithm with the related work, we use the same 25 signs as in [17]: alive, all, boy, building, buy, cold, come, computer, cost, crazy, danger, deaf, different, girl, glove, go, god, joke, juice, man, where, which, yes, you and zero. Hence there are 675 items in our final data set, with 27 samples per class. We take one item as the test set and the remainder as the training set. In the experiments we recognize a gesture Gj as the gesture Gi with the highest Ψ(Gi, Gj). In principle, we could apply a method similar to K-Nearest Neighbor (KNN), that is, compute the K highest values of Ψ(Gi, Gj) and assign the sign of the majority to the gesture being recognized. Normally K is chosen as an odd number, often through cross-validation. However, it has been shown that the performance of 1NN with Euclidean distance is very hard to exceed in many applications [17], and the obvious advantage of 1NN is that it does not need any parameters. After several trials, we found that the performance with K = 3 or 5 was not as good as with K = 1 in our case. We finally compute the average error rate by leave-one-out cross validation.

4.2 Algorithm Complexity

Let N be the total number of items in the dataset. The eigenvectors can be obtained by computing the SVD of the covariance matrix of size n × n; the time complexity of computing the SVD of an m × n matrix is O(n^3) [13]. The complexity of the first three steps is therefore O(Sn^3). Computing the similarities and performing the leave-one-out cross validation (Steps 4 and 5) take O(N^2). Therefore the complexity of our algorithm is O(Sn^3 + N^2).

4.3 Parameters Selection

In our algorithm, we have two parameters to be decided: l (the length of the subparts) and S (the number of subparts). Different parameter values can influence the performance of our approach. For the parameter S, we pre-set S = 5, since it was empirically found that five segments are suitable for characterizing gestures sufficiently [8]. However, we tested our approach with S = 3, 4, 5 and 6 and found that the best performance appears at S = 4 instead of S = 5.
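As an illustration of the evaluation protocol, a leave-one-out 1NN loop might look as follows; `dataset` is assumed to be a list of (gesture matrix, label) pairs and `seg_svd_similarity` is the sketch from the previous section.

```python
def loo_error_rate(dataset, S, l):
    """Leave-one-out cross validation with a 1NN rule: each item is
    classified as the training item with the highest Eq. (3) similarity."""
    errors = 0
    for i, (Gj, label_j) in enumerate(dataset):
        best_sim, best_label = float('-inf'), None
        for k, (Gi, label_i) in enumerate(dataset):
            if k == i:          # leave the test item out of the training set
                continue
            sim = seg_svd_similarity(Gi, Gj, S, l)
            if sim > best_sim:
                best_sim, best_label = sim, label_i
        errors += (best_label != label_j)
    return errors / len(dataset)
```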
Fig. 4. Error rate with different parameters (error rate vs. length of subparts l, for S = 3, 4, 5 and 6)
For the parameter l, we restricted it to values between n and s − S + 1. Fig. 4 shows that the performance worsens as l increases; the best results are obtained at approximately l = 24 for the different values of S. It is clear that the error rates for S = 5 and 6 are higher than those for S = 3 and 4. The performance for S = 3 and S = 4 is very close, but the error rate is still lower when S = 4. The best performance is achieved with S = 4 and l = 24; at this point the error rate is 0.0607, which is much lower than the 0.11 obtained by the approach proposed in [13]. As observed in Fig. 4, the overall trend of the error rates is upward when l > 24 for the different S; the error rates achieve their best result at l = 24, except for S = 5, which achieves its best result at l = 23. These results lead to the empirical conclusion that the proposed algorithm achieves its best performance when S = 4 and l = 24.
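The parameter selection described above amounts to a small grid sweep over S and l; a sketch, reusing the hypothetical `loo_error_rate` routine from the previous section, is:

```python
def select_parameters(dataset, n, s):
    """Sweep S over 3..6 and l over n .. s - S + 1 (the ranges used in the
    paper) and keep the pair with the lowest leave-one-out error rate."""
    best = (None, None, float('inf'))
    for S in (3, 4, 5, 6):
        for l in range(n, s - S + 2):
            err = loo_error_rate(dataset, S, l)
            if err < best[2]:
                best = (S, l, err)
    return best  # (S, l, error rate)
```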
5 Conclusion

In this paper, we have proposed an approach for recognizing hand gestures. We examined the properties of the singular value decomposition of gesture data obtained from gesture sensors: the first singular vectors of two similar gestures are nearly parallel to each other, which is not the case for dissimilar gestures. We have proposed applying segmented singular value decomposition to the similarity measure between two gestures. Our method reveals the character of the gesture data both globally and locally, which substantially improves performance. The best result is obtained by segmenting the gesture data into four parts with 24 time stamps in each subpart.

We are aware that a comprehensive comparison with other approaches is needed in the future. We are also aware of the need to evaluate the proposed method on other datasets (e.g. gesture data with different DOFs). In the future, we can also explore the possibility of multiple similarity measures, using more than one method to improve gesture recognition in case two methods complement each other.

Acknowledgments. This project is sponsored by the Australian Research Council Discovery Grant titled "Gesture Recognition in Virtual Reality" (DP09880088).
References 1. Ishikawa, M., Matsumura, H.: Recognition of a hand-gesture based on self-organization using a dataglove. In: Proceedings of 6th International Conference on Neural Information Processing, ICONIP 1999, vol. 2 (1999) 2. Fels, S.S., Hinton, G.E.: Glove-talkii-a neural-network interface which maps gestures to parallel formant speech synthesizer controls. IEEE Transactions on Neural Networks 9(1), 205–212 (1998) 3. Yang, K., Shahabi, C.: A pca-based kernel for kernel pca on multivariate time series. In: Proceedings of ICDM 2005 Workshop on Temporal Data Mining: Algorithms, Theory and Applications, New Orleans, Louisiana, USA, pp. 149–156 (November 2005) 4. Agnes Just, S.M.: A comparative study of two state-of-the-art sequence processing techniques for hand gesture recognition. Comput. Vis. Image Underst. 113(4), 532–543 (2009) 5. Erol, A., Bebis, G., Nicolescu, M., Boyle, R., Twombly, X.: Vision-based hand pose estimation: A review. Computer Vision and Image Understanding 108(1-2), 52–73 (2007) 6. Takahashi, T., Kishino, F.: Hand gesture coding based on experiments using a hand gesture interface device. ACM SIGCHI Bulletin 23(2), 67–74 (1991) 7. Kadous, M.W.: Machine recognition of auslan signs using powergloves: Towards largelexicon recognition of sign language. In: Proceedings of the Workshop on the Integration of Gesture in Language and Speech, pp. 165–174 (1996) 8. Kadous, M.: Temporal Classification: Extending the Classification Paradigm to Multivariate Time Series. PhD thesis (2002) 9. Parvini, F., Shahabi, C.: Utilizing bio-mechanical characteristics for user-independent gesture recognition. In: 21st International Conference on Data Engineering Workshops, Tokyo, p. 1170 (April 2005) 10. Eisenstein, J., Ghandeharizadeh, S., Golubchik, L., Shahabi, C., Yan, D., Zimmermann, R.: Device independence and extensibility in gesture recognition. In: IEEE Virtual Reality Conference (VR), LA, Los Angeles, CA, USA, pp. 207–214 (March 2003) 11. Jiayang, L., Zhen, W., Lin, Z., Wickramasuriya, J., Vasudevan, V.: uwave: Accelerometerbased personalized gesture recognition and its applications, pp. 1–9 (March 2009) 12. Yang, K., Shahabi, C.: A pca-based similarity measure for multivariate time series. In: Proceedings of the 2nd ACM International Workshop on Multimedia Databases, Washington, DC, USA, pp. 65–74 (November 2004) 13. Li, C., Zhai, P., Zheng, S., Prabhakaran, B.: Segmentation and recognition of multi-attribute motion sequences. In: Proceedings of the 12th Annual ACM International Conference on Multimedia, New York, USA, pp. 836–843 (October 2004) 14. Li, C., Prabhakaran, B.: A similarity measure for motion stream segmentation and recognition. In: Proceedings of the 6th International Workshop on Multimedia Data Mining: Mining Integrated Media and Complex Data, Chicago, pp. 89–94 (August 2005) 15. Li, C., Prabhakaran, B., Zheng, S.: Similarity measure for multi-attribute data. In: Proceedings of the 2005 IEEE International Conference on Acoustics, Speach, and Signal Processing (ICASSP), Philadelphia, USA (March 2005) 16. Ephraim, Y., Malah, D.: Speech enhancement using a minimum-mean square error shorttime spectral amplitude estimator. IEEE Transactions on Acoustics Speech and Signal Processing 32(6), 1109–1121 (1984) 17. Weng, X., Shen, J.: Classification of multivariate time series using two-dimensional singular value decomposition. Knowledge-Based Systems 21(7), 535–539 (2008)
Reasoning and Inference Rules in Basic Linear Temporal Logic BLTL S. Babenyshev and V. Rybakov Department of Computing and Mathematics, Manchester Metropolitan University, John Dalton Building, Chester Street, Manchester M1 5GD, U.K., and Institute of Mathematics, Siberian Federal University, 79 Svobodny Prospect, Krasnoyarsk, 660041, Russia [email protected], [email protected], [email protected]
Abstract. Our paper¹ studies a formalization of reasoning in the basic linear temporal logic BLTL in terms of admissible inference rules. The paper contains the necessary preliminary information and a description of a newly developed technique that, through a sequence of mathematical lemmas, leads to our main result: an explicit basis for the rules admissible in BLTL (which, in particular, allows admissible rules to be computed).

¹ Supported by the Engineering and Physical Sciences Research Council (EPSRC), U.K., grant EP/F014406/1.
1 Introduction

An essential part of logical reasoning is inference (deduction), which derives a conclusion from given premises. Such procedures are usually based on inference rules, which draw a conclusion from their preconditions. Consider as an example: "When it rains, the grass gets wet. It rains. Therefore, the grass is wet." If this approach to reasoning is formalized mathematically, the resulting system is uniquely determined by the accepted inference rules and assumptions (axioms). Thus, the choice of inference rules is of fundamental importance. The most general concept of possible inference rules is that of admissible rules, introduced by P. Lorenzen (1955). In a logic L, a rule of inference is admissible if the set of theorems of L is closed w.r.t. this rule. It is clear immediately from the definition that the class of admissible rules is the most general class of rules compatible with the logic, so it is the most powerful collection of rules that may be included in correct (sound and complete) axiomatic systems.

Research on admissible rules was stimulated by a question of H. Friedman [7]: whether the problem of admissibility of inference rules is decidable in the intuitionistic logic IPC. This problem, as well as its analogues for various transitive modal logics (e.g. S4, K4, S5, S4.3), was studied and solved affirmatively by V. Rybakov (cf. [37], the first solution for IPC and S4, and [38,39]) by finding algorithms deciding these problems. Later, a solution of the admissibility problem in IPC based on a technique close to proof theory was found by P. Rozière [45]. S. Ghilardi found a
solution of the admissibility problem for IPC and S4 via unification and projective formulas (cf. [12]); this technique and its enhancements were also successfully applied to unification and projectivity problems in various logics (cf. S. Ghilardi [13,14,15,16]). A. Kuznetsov's problem of whether IPC (and S4) has a finite basis for admissible rules was answered in the negative by V. Rybakov (cf. [40] and [39], respectively). However, it was not clear which explicit bases may describe the rules admissible in IPC. This question was answered by R. Iemhoff [18], where it was shown that the de Jongh-Visser rules indeed form a basis for the rules admissible in IPC; a modal variation of this basis forms a basis for the rules admissible in S4 (V. Rybakov [41]). Starting from [18], the structure of possible admissible rules in terms related to proof theory has been intensively studied (S. Artemov and R. Iemhoff [1], R. Iemhoff [19], R. Iemhoff and G. Metcalfe [20,21], G. Metcalfe [34,35]). Explicit bases were found, but the question of whether independent bases are possible remained open. It was solved by E. Jeřábek [22], who found an independent basis for the rules admissible in IPC. Jeřábek [23] finds bases for admissible rules in several modal logics related to S4, and Jeřábek [24], it seems for the first time explicitly, gives precise estimations of the complexity of algorithms deciding admissibility.

All these results concerning admissibility were impressive, yet they are primarily focused on non-classical logics related to IPC and transitive modal logics (like S4 and K4). Therefore a move to logics with a different background, or well known from the viewpoint of applications, was felt to be necessary. In this line, recently, the admissibility problem for the linear temporal logic LTL was solved (Rybakov [44]). The cases of the temporal logic based on the integers Z (Rybakov [42]) and of temporal logics admitting the universal modality (Rybakov [43]) were also studied, and algorithms solving admissibility were found. Recently the research was also directed towards Łukasiewicz logic, and algorithms solving decidability as well as bases for admissible rules were suggested (Jeřábek [25,26]). The decidability problem for admissibility in the linear temporal logic LTL was thus solved (cf. above, i.e. [44]), but the question of finding explicit bases for the rules admissible in LTL remained open. In this paper we investigate this problem for a weaker version of LTL, the basic linear temporal logic BLTL.

The logic LTL is a well-known and important non-classical logic with a background in AI and CS. LTL has been widely implemented in AI and CS (cf. e.g. Manna and Pnueli [32,33], E. Clarke et al. [5]), and linear temporal logics have been used by mathematicians and philosophers for reasoning about knowledge and time (cf. e.g. R. Goldblatt [10], van Benthem [4]). LTL, as well as various other temporal logics, has numerous applications to problems arising in computing (cf. Barringer, Fisher, Gabbay and Gough (Eds.) [3]). For checking satisfiability in LTL a subtle technique based on automata has been developed (cf. Fritz [8], Vardi [6,47,48]). The decidability of LTL can also be shown via the bounded finite model property (cf. Sistla, Clarke [46]). The first axiomatization of LTL, to the best of our knowledge, was suggested by D. Gabbay, A. Pnueli, S. Shelah, and J. Stavi [9] (cf. also [27]). Later, Lange [30] also gave a decision procedure and a complete axiomatization for LTL extended
by Past. Various axiomatic systems for LTL and related logics are suggested and discussed in [28]. We aim to extend these cornerstone results towards finding an axiomatization for the rules admissible in BLTL. The main result of our paper is the construction of an explicit basis for these rules. Using this basis and standard satisfiability (in BLTL) verification, we obtain an algorithm computing admissible inference rules.
2 Definitions, Notation, Basic Facts on Logic BLTL

We include below all the definitions and notation necessary to follow the paper, though some preliminary acquaintance with non-classical propositional logics (in the symbolic/mathematical approach) and their algebraic and Kripke-Hintikka semantics is advisable.

To introduce the language of the logic BLTL we fix an enumerable set Var := {x1, x2, x3, ...} of propositional letters (variables). The formulas over the propositional language L := ⟨∨², ∧², →², ¬¹, N¹, ♦¹, ⊤⁰, ⊥⁰⟩ are defined by the following grammar:

α ::= xi | α1 ∧ α2 | α1 ∨ α2 | α1 → α2 | ¬α | Nα | ♦α | ⊤ | ⊥

The set of all L-formulas is denoted by FmL. For a formula α, Var(α) denotes the set of variables occurring in α. We will use the common abbreviation □α for ¬♦¬α.

Let N := ⟨N, N, R⟩ be the Kripke structure based on the natural numbers N = {0, 1, 2, 3, ...}, where N ⊆ N × N (N is for Next) is defined as N := {(i, i + 1) | i ∈ N} and R is the reflexive and transitive closure of N, i.e. ≤. The logic BLTL (Basic Linear Temporal Logic) is defined to be the set of all L-formulas valid in all Kripke models based on the frame N, i.e., BLTL is the set of all formulas α ∈ FmL such that for any valuation of variables ν : Var(α) → 2^N, N ⊨ν α, where the truth values of the non-boolean connectives N and ♦ are defined as follows:

(N, i) ⊨ν Nα ⟺ (N, i + 1) ⊨ν α,
(N, i) ⊨ν ♦α ⟺ ∃j ≥ i : (N, j) ⊨ν α.

Note that we will be using the symbol N both as a logical connective and as the name of the corresponding accessibility relation. We denote by M1 ⊔ M2 the disjoint union of the models M1 and M2.

Definition 1. For a fixed n, let N := {⟨N, ν⟩ | ν : {x1, ..., xn} → 2^N}. Directly from the definition of BLTL, N is an n-characteristic model for BLTL, i.e., for every L-formula α(x1, ..., xn): α ∈ BLTL ⟺ N ⊨ α.

The following formulas are known theorems of BLTL (which can be verified by a direct check of their validity in N):

Nx → ♦x                              (T1)
□(□x → y) ∨ □(□y → x)                (T2)
♦x → (x ∨ N♦x)                       (T3)
(x ∧ □(x → Nx)) → □x                 (T4)
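For illustration, the truth conditions above can be evaluated mechanically on a finite prefix of N. The sketch below is only an approximation of the semantics (♦ and □ quantify over all j ≥ i, which a finite prefix cannot fully capture), and the tuple encoding of formulas is our own choice, not part of the paper.

```python
# Formulas as nested tuples, e.g. ('->', ('N', ('var', 'x')), ('<>', ('var', 'x'))).
def holds(phi, i, val, horizon):
    """Truth of phi at point i of N, evaluated on the prefix 0..horizon-1.
    val maps a variable name to the set of points where it is true."""
    op = phi[0]
    if op == 'var':  return i in val[phi[1]]
    if op == 'top':  return True
    if op == 'bot':  return False
    if op == 'not':  return not holds(phi[1], i, val, horizon)
    if op == 'and':  return holds(phi[1], i, val, horizon) and holds(phi[2], i, val, horizon)
    if op == 'or':   return holds(phi[1], i, val, horizon) or holds(phi[2], i, val, horizon)
    if op == '->':   return (not holds(phi[1], i, val, horizon)) or holds(phi[2], i, val, horizon)
    if op == 'N':    return holds(phi[1], i + 1, val, horizon)                        # next point
    if op == '<>':   return any(holds(phi[1], j, val, horizon) for j in range(i, horizon))
    if op == '[]':   return all(holds(phi[1], j, val, horizon) for j in range(i, horizon))
    raise ValueError(op)

# (T1) Nx -> <>x, checked at point 0 for one sample valuation of x:
val = {'x': {2, 5}}
t1 = ('->', ('N', ('var', 'x')), ('<>', ('var', 'x')))
print(holds(t1, 0, val, horizon=10))   # True for this valuation
```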
3 Stone's Spaces for BLTL
Here we list important facts, stated in the following lemmas, which are essentially used in the mathematical proofs of this paper. In particular, we need the basics of the algebraic semantics for BLTL and a rudimentary duality theory.

Definition 2. A BLTL-algebra is an algebra A = ⟨A, ∨, ∧, ¬, N, ♦, ⊤, ⊥⟩, where
1. ⟨A, ∨, ∧, ¬, ⊤, ⊥⟩ is a boolean algebra;
2. N : A → A, ♦ : A → A;
3. A |= α = β for every bi-implication α ↔ β ∈ BLTL.

Instead of A |= α = ⊤, we will customarily write A |= α.
Given an n-generated BLTL-algebra A = A(a1, ..., an) with fixed generators a1, ..., an, we can define a Kripke model A* := ⟨U(A), N, R, ν⟩, where
(i) U(A) is the set of ultrafilters of the boolean algebra ⟨A, ∨, ∧, ¬, ⊤, ⊥⟩, where A is the universe of A,
(ii) for all u, v ∈ U(A):
  uNv ⟺ (∀a ∈ A)(Na ∈ u ⟺ a ∈ v),
  uRv ⟺ (∀a ∈ A)(a ∈ v ⟹ ♦a ∈ u),
(iii) ν is the valuation of the variables x1, ..., xn defined as follows: u ∈ ν(xi) ⟺ ai ∈ u.

Note that the definition of the valuation in A(ā)* depends on the choice of generators. Further on, we will customarily write u ∈ A* instead of u ∈ U(A).

Lemma 1. For each u ∈ A*, there exists a unique v ∈ A* (possibly u itself), such that uNv.

Lemma 1 shows that the binary relation N ⊆ U(A) × U(A) is, in fact, a function; therefore we will often use the usual functional notation v = N(u) whenever uNv. It can easily be proved by induction that

Lemma 2. Suppose A(ā) is a BLTL-algebra. Then
1. α(ā) ∈ u ⟺ (A*, u) ⊨ α(x̄);
2. A |= α(ā) ⟺ A* ⊨ α(x̄);
3. for all α(x̄) ∈ BLTL: A* ⊨ α(x̄).

Note that A* from the previous lemma is neither necessarily based on a frame isomorphic to N, nor necessarily countable.

Lemma 3. If A is a BLTL-algebra, then for all u, v, w ∈ A*:
1. uRv & uRw ⟹ vRw ∨ wRv,
2. uNv ⟹ uRv,
3. uRv & u ≠ v ⟹ N(u)Rv.
4 Reduced Forms of Inference Rules
An (inference) rule is a pair ⟨{α1, ..., αn}, β⟩, where α1, ..., αn, β are formulas. We will usually write a rule ⟨{α1, ..., αn}, β⟩ in the form α1, ..., αn / β. The informal meaning of this inference rule is: (i) β follows from the premises (assumptions, hypotheses) α1, ..., αn, or (ii) this rule infers β from α1, ..., αn. Since BLTL has conjunction in the language, it suffices to consider only rules with a single premise. For a rule r = α/β we set Var(r) := Var(α) ∪ Var(β).

A rule r = α/β is valid in a model M = ⟨F, ν⟩ (symbolically M ⊨ r) if Var(r) ⊆ dom(ν) and M ⊨ α ⟹ M ⊨ β. A rule r is valid in a frame F if, for every valuation ν of the variables from Var(r), F ⊨ν r. Thus, if the rule r is not valid in F, then there is a valuation ν such that F ⊭ν r; in that case we say that r is refuted in F (by ν).

The strongest class of structural logical rules correct for a given logic Λ (admissible rules, the main object of our paper) was introduced by Lorenzen (1955) [31]. Given an arbitrary propositional logic Λ, FmΛ is the set of all formulas in the language of Λ. Let r := ϕ1(x1, ..., xn), ..., ϕm(x1, ..., xn) / ψ(x1, ..., xn) be a rule in the language of the logic Λ.

Definition 3. The rule r is said to be admissible in Λ if, for all α1, ..., αn ∈ FmΛ,
⋀_{1≤i≤m} [ϕi(α1, ..., αn) ∈ Λ] ⟹ [ψ(α1, ..., αn) ∈ Λ].

As an example, the rule Nx/x is admissible for BLTL but is not valid in BLTL-frames.

An n-clause (or clause, if n is clear from the context) is a formula over the language L with n variables which has the form

⋀_{k=1}^{n} xk^{t(B,k)} ∧ ⋀_{k=1}^{n} (♦xk)^{t(♦,k)} ∧ ⋀_{k=1}^{n} (Nxk)^{t(N,k)},    (1)
where t is a function t : {B, ♦, N} × {1, ..., n} → {0, 1}, and for each formula α, α^0 := ¬α and α^1 := α. It is easy to see that there can be at most 2^{3n} distinct clauses over a set of n variables. We denote the set of all clauses over the variables x1, ..., xn by Φ(x1, ..., xn). For every φ ∈ Φ(X) and S ∈ {B, ♦, N}, we denote θS(φ) := {xk ∈ X | t(S, k) = 1}. We say that a clause φ is satisfiable in a model M if there is a world w ∈ M such that (M, w) ⊨ φ. Given a Kripke model M with a valuation ν of finite domain, every world w ∈ M determines a unique clause φw(M) ∈ Φ(dom(ν)), defined as

φw(M) := ⋀_{(M,w) ⊨ xk^{e_{B,k}}} xk^{e_{B,k}} ∧ ⋀_{(M,w) ⊨ (♦xk)^{e_{♦,k}}} (♦xk)^{e_{♦,k}} ∧ ⋀_{(M,w) ⊨ (Nxk)^{e_{N,k}}} (Nxk)^{e_{N,k}}.
We will not mention M in φw(M) whenever the model M is clear from the context. A rule r, with Var(r) = {x1, ..., xn}, is said to be in reduced normal form if

r = ⋁_{1≤j≤s} φj / x1,    (RNF)
where each disjunct φj is an n-clause. For a formula α, Sub(α) denotes the set of subformulas of α; for a rule r = α/β, Sub(r) := Sub(α) ∪ Sub(β). Using a technique similar to the one described in [39, Section 3.1], we can prove

Lemma 4. Every rule r = α/β can be transformed in exponential time into a rule r_nf in reduced normal form which is equivalent to r w.r.t. validity and admissibility.

The reduced normal form obtained in Lemma 4 is uniquely defined and will be denoted by r_nf.
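As an illustration, an n-clause can be stored simply as the three sets θB, θ♦, θN, and the clause φw determined by a world can be read off a model. The sketch below uses a finite-prefix valuation as in the earlier semantics sketch, so the ♦-component is only approximated; the representation is our own, not the authors'.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Clause:
    """An n-clause over x_1..x_n, kept as the sets of indices k with
    t(B,k)=1, t(<>,k)=1 and t(N,k)=1 respectively (at most 2**(3n) clauses)."""
    theta_B: frozenset
    theta_D: frozenset   # theta_diamond
    theta_N: frozenset

def phi_w(w, val, n, horizon):
    """Clause phi_w determined by point w of a model given as a valuation
    val[k] = set of points where x_k is true (finite prefix 0..horizon-1)."""
    return Clause(
        theta_B=frozenset(k for k in range(1, n + 1) if w in val[k]),
        theta_D=frozenset(k for k in range(1, n + 1)
                          if any(j in val[k] for j in range(w, horizon))),
        theta_N=frozenset(k for k in range(1, n + 1) if (w + 1) in val[k]),
    )
```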
5 Coherent Strings of Clauses
This section describes the auxiliary technique which is essential for obtaining our main results.

Definition 4. A finite sequence of clauses φ1, φ2, ..., φn is called a coherent string if for each i < n:
1. θN(φi) = θB(φi+1),
2. θ♦(φi) = θB(φi) ∪ θ♦(φi+1).

If a coherent string φ1, φ2, ..., φn, φ1 satisfies the condition θ♦(φi) = ⋃_{j=1}^{n} θB(φj) for all i = 1, 2, ..., n, then it is called a coherent loop.
We will list for future reference some easy properties of coherent strings.

Lemma 5. (i) For all u, v ∈ A*, if uNv, then φu, φv is a coherent string. (ii) If the strings φ1, ..., φk and φk, ..., φk+l are coherent, then the string φ1, ..., φk−1, φk, φk+1, ..., φk+l is coherent.

We will also need several combinatorial results.

Definition 5. Suppose we have a finite non-empty set A and suppose that a, b ∈ A, a ≠ b. A reflexive binary relation P ⊆ A × A is called a,b-oriented if

∀X ⊆ A : ⟨a, b⟩ ∈ X × X̄ ⟹ ∃⟨x, y⟩ ∈ P ∩ (X × X̄),

where X̄ is A \ X, the complement of X in A.

Lemma 6. If P ⊆ A × A is a,b-oriented, then there exists a P-path from a to b.

Lemma 7. Let ξ, η ∈ A*. If ξRη, then there exists a coherent string φξ, ..., φη of clauses satisfiable in A*.
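With the clause representation sketched in the previous section, Definition 4 (as reconstructed above) can be checked mechanically:

```python
def is_coherent_string(clauses):
    """Definition 4: theta_N(phi_i) = theta_B(phi_{i+1}) and
    theta_<>(phi_i) = theta_B(phi_i) | theta_<>(phi_{i+1}) for each i < n."""
    return all(c.theta_N == d.theta_B and c.theta_D == (c.theta_B | d.theta_D)
               for c, d in zip(clauses, clauses[1:]))

def is_coherent_loop(clauses):
    """A coherent string phi_1,...,phi_n,phi_1 whose theta_<> equals the union
    of all theta_B along the loop, for every clause (clauses is a list)."""
    union_B = frozenset().union(*(c.theta_B for c in clauses))
    return (is_coherent_string(clauses + clauses[:1])
            and all(c.theta_D == union_B for c in clauses))
```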
6 Main Results
For n = 2, 3, ..., let

s_n := [ ((⋁_{i=1}^{n} x_i) ↔ ⊤) ∧ ⋀_{i=1}^{n} ⋀_{j≠i} (((x_i ∧ x_j) → ⊥) → (x_i → ♦x_j)) ] / ⊥.

Lemma 8. Let A = A(a1, ..., an) be an n-generated BLTL-algebra such that A* ⊨ s_i for i = 2, ..., 2^{3n}. Then there is u ∈ A* such that θB(φu) = θN(φu) = θ♦(φu).

Definition 6. A d-frame is a finite frame ⟨{a0, ..., an}, N, R⟩, where
1. there is k, 0 ≤ k ≤ n, such that N = {⟨ai, ai+1⟩ | i = 0, ..., n − 1} ∪ {⟨an, ak⟩},
2. R is the reflexive and transitive closure of N.
(In the definition, a0, ..., an are distinct.) A frame ⟨W ∪ {a}, N ∪ {⟨a, a⟩}, R ∪ {⟨a, a⟩}⟩, where ⟨W, N, R⟩ is a d-frame and a ∉ W, is called a de-frame. A Kripke model based on a d(de)-frame (see Fig. 1) will be called a d(de)-model whenever φ_{a0} = φ_{a1}.

Let, for n = 1, 2, ...,
r_n := ¬⋀_{i=1}^{n} (x_i ↔ Nx_i) ∧ N²♦¬x_0 / x_0.
Lemma 9. Let r = ⋁_{1≤j≤k} φ_j / x_1 be a rule in reduced form and suppose that A = A(a_1, ..., a_n) is an n-generated BLTL-algebra such that A ⊭ ⋁_{1≤j≤k} φ_j(ā) / a_1 and A ⊨ r_k. Then there exists a d-model M such that M ⊭ r.
Fig. 1. An illustration of d(de)-models
Lemma 10. Let r = ⋁_{1≤j≤s} φ_j(x_1, ..., x_n) / x_1 be a rule in reduced form. Suppose there is a de-model M = ⟨F, μ⟩, with dom μ = {x_1, ..., x_n}, such that M ⊭ r. Then r is not admissible in BLTL.

Theorem 1. The set of inference rules

R := {s_n | n ≥ 2} ∪ {r_n | n ≥ 1} ∪ { x / Nx, x / □x, (x, x → y) / y },

plus any axiomatization of BLTL, forms a basis for the admissible rules of BLTL.

Using this theorem and standard satisfiability (in BLTL) verification we obtain an algorithm computing admissible inference rules.
7 Conclusion

The major goal of this paper is to provide a logical framework suitable for formalizing reasoning within the temporal logic BLTL. This logic, as a fragment of the standard linear temporal logic LTL, reflects basic laws of reasoning about computational processes in linear time (a linear flow of computation). The accepted semantic approach to such logics is based on the well-understood Kripke-Hintikka semantics and potentially allows many highly expressive logical operators to be considered. We consider the most general aspect of logical reasoning, admissible inference rules. Our main result is a constructed basis for all rules admissible in BLTL; using this basis we obtain an algorithm computing the rules admissible in BLTL. The paper presents the scheme of a mathematical proof that the suggested sequence of rules indeed forms a basis (detailed proofs of the lemmas and theorems are omitted due to the limitation on the size of the paper). These results may be useful for researchers from the fields of AI and CS for verifying the correctness of reasoning about properties that can be formalized in BLTL.

There are many open avenues for further research. In particular, the following open problems are not solved yet: (i) Is a finite basis for the rules admissible in BLTL possible? (ii) If not, does an independent basis exist? (iii) Is the unification problem in BLTL for formulas with parameters (meta-variables) decidable?
References 1. Artemov, S., Iemhoff, R.: The basic intuitionistic logic of proofs. Journal of Symbolic Logic 72(2), 439–451 (2007) 2. Blok, W., Pigozzi, D.: Algebraizable logics. Mem. Amer. Math. Soc., vol. 396. AMS, Providence (1989) 3. Barringer, H., Fisher, M., Gabbay, D., Gough, G.: Advances in Temporal Logic. Applied logic series, vol. 16. Kluwer Academic Publishers, Dordrecht (1999) 4. van Benthem, J.: The Logic of Time. Kluwer, Dordrecht (1991) 5. Clarke, E., Grumberg, O., Hamaguchi, K.: Another look at ltl model checking. In: Dill, D.L. (ed.) CAV 1994. LNCS, vol. 818, pp. 415–427. Springer, Heidelberg (1994)
6. Daniele, M., Giunchiglia, F., Vardi, M.Y.: Improved automata generation for linear temporal logic. In: Halbwachs, N., Peled, D.A. (eds.) CAV 1999. LNCS, vol. 1633, pp. 249–260. Springer, Heidelberg (1999) 7. Friedman, H.: One hundred and two problems in mathematical logic. Journal of Symbolic Logic 40(3), 113–130 (1975) 8. Fritz, C.: Constructing b¨ uchi automata from linear temporal logic using simulation relations for alternating b¨ uchi automata. In: Ibarra, O.H., Dang, Z. (eds.) CIAA 2003. LNCS, vol. 2759, pp. 35–48. Springer, Heidelberg (2003) 9. Gabbay, D., Pnueli, A., Shelah, S., Stavi, J.: On the temporal analysis of fairness. In: POPL 1980: Proceedings of the 7th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pp. 163–173. ACM, New York (1980), doi:http://doi.acm.org/10.1145/567446.567462 10. Goldblatt, R.: Logics of time and computation. Center for the Study of Language and Information, Stanford (1987) 11. Goranko, V., Passy, S.: Using the universal modality: Gains and questions. Journal of Logic and Computation 2, 5–30 (1992) 12. Ghilardi, S.: Unification in intuitionistic logic. J. Symb. Log. 64(2), 859–880 (1999) 13. Ghilardi, S.: Unification through projectivity. J. Log. Comput. 7(6), 733–752 (1997) 14. Ghilardi, S.: Best solving modal equations. Ann. Pure Appl. Logic 102(3), 183–198 (2000) 15. Ghilardi, S.: A resolution/tableaux algorithm for projective approximations in ipc. Logic Journal of the IGPL 10(3), 229–243 (2002) 16. Ghilardi, S.: Unification, finite duality and projectivity in varieties of heyting algebras. Ann. Pure Appl. Logic 127(1-3), 99–115 (2004) 17. Harrop, R.: Concerning formulas of the types a → b ∨ c, a → ∃xb(x). Journal of Symbolic Logic 25(1), 27–32 (1960) 18. Iemhoff, R.: On the admissible rules of intuitionistic propositional logic. J. Symb. Log. 66(1), 281–294 (2001) 19. Iemhoff, R.: Intermediate logics and visser’s rules. Notre Dame Journal of Formal Logic 46(1), 65–81 (2005) 20. Iemhoff, R., Metcalfe, G.: Proof theory for admissible rules. Ann. Pure Appl. Logic 159(1-2), 171–186 (2009) 21. Iemhoff, R., Metcalfe, G.: Hypersequent systems for the admissible rules of modal and intermediate logics. In: Art¨emov, S.N., Nerode, A. (eds.) LFCS. LNCS, vol. 5407, pp. 230–245. Springer, Heidelberg (2009) 22. Jer´ abek, E.: Independent bases of admissible rules. Logic Journal of the IGPL 16(3), 249–267 (2008) 23. Jeˇr´ abek, E.: Admissible rules of modal logics. J. Log. and Comput. 15(4), 411–431 (2005), doi:http://dx.doi.org/10.1093/logcom/exi029 24. Jer´ abek, E.: Complexity of admissible rules. Arch. Math. Log. 46(2), 73–92 (2007) 25. Jeˇr´ abek, E.: Admissible rules of L ukasiewicz logic. Journal of Logic and Computation (accepted) 26. Jeˇr´ abek, E.: Bases of admissible rules of L ukasiewicz logic. Journal of Logic and Computation (accepted) 27. Kr¨ oger, F.: Temporal logic of programs. Springer, New York (1987) 28. Kr¨ oger, F., Merz, S.: Temporal Logic and State Systems, Texts in Theoretical Computer Science. EATCS Series. Springer, Heidelberg (2008) 29. Kapron, B.: Modal sequents and definability. Journal of Symbolic Logic 52(3), 756–765 (1987) 30. Lange, M.: A quick axiomatisation of ltl with past. Math. Log. Q. 51(1), 83–88 (2005)
Reasoning and Inference Rules in Basic Linear Temporal Logic BLTL
233
31. Lorenzen, P.: Einfuerung in Operative Logik und Mathematik. Springer, Heidelberg (1955) 32. Manna, Z., Pnueli, A.: Temporal Verification of Reactive Systems: Safety. Springer, Heidelberg (1995) 33. Manna, Z., Pnueli, A.: The Temporal Logic of Reactive and Concurrent Systems: Specification. Springer, Heidelberg (1992) 34. Metcalfe, G.: Proof theory for Casari’s comparative logics. J. Log. Comput. 16(4), 405–422 (2006) 35. Metcalfe, G., Olivetti, N.: Proof systems for a g¨ odel modal logic. In: Giese, M., Waaler, A. (eds.) TABLEAUX 2009. LNCS, vol. 5607, pp. 265–279. Springer, Heidelberg (2009) 36. Mints, G.: Derivability of admissible rules. Journal of Soviet Mathematics 6(4), 417–421 (1976) 37. Rybakov, V.: A criterion for admissibility of rules in the modal system S4 and the intuitionistic logic. Algebra and Logica 23(5), 369–384 (1984) 38. Rybakov, V.: Rules of inference with parameters for intuitionistic logic. Journal of Symbolic Logic 57(3), 912–923 (1992) 39. Rybakov, V.: Admissible Logical Inference Rules. Studies in Logic and the Foundations of Mathematics, vol. 136. Elsevier Sci. Publ./North-Holland (1997) 40. Rybakov, V.: The bases for admissible rules of logics s4 and int. Algebra and Logic 24, 55–68 (1985) 41. Rybakov, V.: Construction of an explicit basis for rules admissible in modal system S4. Mathematical Logic Quarterly 47(4), 441–451 (2001) 42. Rybakov, V.: Logical consecutions in discrete linear temporal logic. Journal of Symbolic Logic 70(4), 1137–1149 (2005) 43. Rybakov, V.V.: Multi-modal and temporal logics with universal formula - reduction of admissibility to validity and unification. J. Log. Comput. 18(4), 509–519 (2008) 44. Rybakov, V.V.: Linear temporal logic with until and next, logical consecutions. Ann. Pure Appl. Logic 155(1), 32–45 (2008) 45. Rozi`ere, P.: Admissible and derivable rules in intuitionistic logic. Mathematical Structures in Computer Science 3(2), 129–136 (1993) 46. Sistla, A.P., Clarke, E.M.: The complexity of propositional linear temporal logics. J. ACM 32(3), 733–749 (1985) 47. Vardi, M.Y.: An automata-theoretic approach to linear temporal logic. In: Banff Higher Order Workshop, pp. 238–266 (1995), http://citeseer.ist.psu.edu/vardi96automatatheoretic.html 48. Vardi, M.Y.: Reasoning about the past with two-way automata. In: Larsen, K.G., Skyum, S., Winskel, G. (eds.) ICALP 1998. LNCS, vol. 1443, pp. 628–641. Springer, Heidelberg (1998)
Direct Adaptive Control of an Anaerobic Depollution Bioprocess Using Radial Basis Neural Networks Emil Petre, Dorin Şendrescu, and Dan Selişteanu Department of Automatic Control, University of Craiova, A.I. Cuza 13, Craiova, Romania {epetre,dorins,dansel}@automation.ucv.ro
Abstract. This work deals with the design and analysis of a nonlinear and neural adaptive control strategy for an anaerobic depollution bioprocess. A direct adaptive controller based on a radial basis function neural network used as on-line approximator to learn the time-varying characteristics of process parameters is developed and then is compared with a classical linearizing controller. The controller design is achieved by using an input-output feedback linearization technique. Numerical simulations, conducted in the case of a strongly nonlinear, time varying and not exactly known dynamical kinetics wastewater biodegradation process, are included to illustrate the behaviour and the performance of the presented controller. Keywords: Nonlinear systems, Neural networks, Adaptive control, Wastewater treatment bioprocesses.
1 Introduction
The traditional control design involves complicated mathematical analysis and has difficulties in controlling highly nonlinear and time-varying plants. A powerful tool for nonlinear controller design is feedback linearization [1], [2], but its use requires complete knowledge of the process. In practice there are many processes described by highly nonlinear dynamics, and an accurate model for these processes is difficult to develop. Therefore, in recent years great progress has been made in the development of adaptive and robust adaptive controllers, due to their ability to compensate for both parametric uncertainties and process parameter variations. An important assumption in previous works on nonlinear adaptive control was the linear dependence on the unknown parameters [2], [3]. Recently, there has been considerable interest in the use of neural networks (NNs) for the identification and control of complex dynamical systems [4], [5], [6], [7], [8]. The main advantage of using NNs in control applications is based both on their ability to uniformly approximate arbitrary input-output mappings [9] and on their learning capabilities that enable the resulting controller to adapt itself to possible variations in the controlled plant dynamics [5]. More precisely, the variations of the plant parameters are transposed into modifications of the NN parameters (i.e. the adaptation of the NN weights). Using feedback linearization and NNs, several NN-based adaptive controllers have been developed for some classes of uncertain, time-varying and nonlinear systems [4], [5], [7], [10], [11].
In this paper, the design and analysis of a direct adaptive control strategy for an anaerobic digestion bioprocess with incompletely known and time-varying dynamics is presented. In fact, by using feedback linearization, the design of a linearizing controller and of a direct adaptive controller is achieved. The design of the direct adaptive controller is based on a radial basis function NN. The control signals are generated using a NN approximation of the functions representing the uncertain and time-varying plant dynamics. Adaptation in this adaptive controller requires on-line adjustment of the NN weights. The adaptation law is derived in a manner similar to the classical Lyapunov-based model reference adaptive control design, where the stability of the closed-loop system in the presence of the adaptation law is ensured. The derived control method is applied to an anaerobic bioprocess used for wastewater treatment. This bioprocess is characterized by strongly nonlinear, time-varying and not exactly known dynamical kinetics. The control objective is to maintain a low pollution level. The paper is organized as follows. In Section 2, the nonlinear model of the anaerobic wastewater treatment process is presented. In Section 3, an exactly linearizing control law and a direct neural adaptive controller are designed; the closed-loop system stability and the boundedness of the controller parameters are also proved. An application of these algorithms to a real wastewater treatment bioprocess is presented in Section 4, and the results of numerical simulations are presented and discussed. Concluding remarks in Section 5 complete the paper.
2 Nonlinear Model of the Anaerobic Bioprocess
Anaerobic digestion is a multi-stage biological wastewater treatment process whereby bacteria decompose the organic matter into carbon dioxide, methane and water. Four metabolic paths can be identified in this process: two for acidogenesis and two for methanisation [12], [13], [14], [15]. Since anaerobic digestion is a very complex bioprocess, whose full model is described by more than ten nonlinear differential equations, from an engineering point of view it is convenient to control it using an appropriately reduced-order model, which can be obtained by the singular perturbation method [12], [14]. To control the pollution level, a simplified model of the anaerobic wastewater treatment process will be used; it is described by the following nonlinear differential equations [16]:
\dot{X}_1 = r_1 - D X_1, \quad \dot{S}_1 = -k_1 r_1 - D S_1 + D S_{in}
\dot{X}_2 = r_2 - D X_2, \quad \dot{S}_2 = k_3 r_1 - k_2 r_2 - D S_2
\dot{S}_5 = k_4 r_1 + k_5 r_2 - D S_5 - Q_{CO_2}        (1)
\dot{P} = k_6 r_2 + k_7 r_1 - D P - Q_P
where X_1 and X_2 are the acidogenic and the acetoclastic methanogenic bacteria, respectively, S_1, S_2 and S_5 are glucose, acetate and inorganic carbon, respectively, and P is methane; r_1 and r_2 are the rates of the acidogenic and methanisation reactions, respectively. Each reaction rate is a growth rate and can be written as r_i = μ_i X_i, i = 1, 2, where μ_i, i = 1, 2, are the specific growth rates of reactions r_i [12], [14]. The nonlinear character of (1) is given by the reaction kinetics, its modelling being the most difficult
task. For this bioprocess, the reaction rates are described as r_1 = μ_1 X_1 = α_1 X_1 S_1, where μ_1 = μ_1^* S_1/(K_{M1} + S_1) is known as the Monod model, and r_2 = μ_2 X_2 = α_2 X_2 S_2, where μ_2 = μ_2^* S_2/(K_{M2} + S_2 + S_2^2/K_{I2}) is known as the Haldane model, with μ_1^*, K_{M1}, μ_2^*, K_{M2} and K_{I2} positive kinetic coefficients. Let us consider that the anaerobic digestion process is operated in a Continuous Stirred Tank Reactor (CSTR), and the only inlet substrate is the organic matter S_1. Q_P = c_p P with c_p > 0 and Q_{CO_2} represent the gaseous outflow rates of CH_4 and CO_2, respectively, S_{in} is the influent substrate (pollutant) concentration, k_i, i = 1, ..., 7, are the yield coefficients, and D is the dilution rate. By defining the state vector as ξ = [X_1 S_1 X_2 S_2 S_5 P]^T, the model (1) can be written in the compact form
\dot{ξ} = K r(ξ) - Dξ + F - Q = K G(ξ) α(ξ) - Dξ + F - Q        (2)
where F = [0 \; D S_{in} \; 0 \; 0 \; 0 \; 0]^T is the vector of inflow rates, Q = [0 \; 0 \; 0 \; 0 \; Q_{CO_2} \; Q_P]^T is the vector of gaseous outflow rates, r = [r_1 \; r_2]^T is the vector of reaction rates, α = [α_1 \; α_2]^T is the vector of specific reaction rates, r(ξ) = G(ξ) α(ξ), G(ξ) = diag{S_i X_i}, i = 1, 2, and K is the yield coefficient matrix:
K = \begin{bmatrix} 1 & -k_1 & 0 & k_3 & k_4 & k_7 \\ 0 & 0 & 1 & -k_2 & k_5 & k_6 \end{bmatrix}^T        (3)
For the anaerobic bioprocess described by the dynamical model (1) we consider the problem of controlling the output pollution level y, defined as y = c_1 S_1 + c_2 S_2, where c_1 and c_2 are known constants, by using an appropriate control input, under the following assumptions: (i) the reaction rates r_i are incompletely known; (ii) the matrix K is known; (iii) the vectors F and Q are known either by measurements or by the user's choice.
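To make the reduced-order model concrete, the following minimal Python sketch integrates (1) with the Monod and Haldane kinetics defined above. The yield and kinetic coefficients are the ones reported in Section 4 of the paper; the proportionality constant c_p, the CO2 outflow, the dilution rate, the influent concentration and the initial state are illustrative assumptions only, not values from the paper.

```python
import numpy as np

# Yield and kinetic coefficients as reported in Section 4
k1, k2, k3, k4, k5, k6, k7 = 3.2, 16.7, 1.035, 1.1935, 1.5, 3.0, 0.113
mu1_star, KM1 = 0.2, 0.5               # Monod parameters (1/h, g/l)
mu2_star, KM2, KI2 = 0.35, 4.0, 21.0   # Haldane parameters
cp = 1.0                               # assumed value for QP = cp * P

def model(xi, D, Sin):
    """Right-hand side of model (1); xi = [X1, S1, X2, S2, S5, P]."""
    X1, S1, X2, S2, S5, P = xi
    mu1 = mu1_star * S1 / (KM1 + S1)                   # Monod growth rate
    mu2 = mu2_star * S2 / (KM2 + S2 + S2**2 / KI2)     # Haldane growth rate
    r1, r2 = mu1 * X1, mu2 * X2
    QP = cp * P
    QCO2 = 0.0                                         # CO2 outflow not modelled here (assumption)
    return np.array([
        r1 - D * X1,
        -k1 * r1 - D * S1 + D * Sin,
        r2 - D * X2,
        k3 * r1 - k2 * r2 - D * S2,
        k4 * r1 + k5 * r2 - D * S5 - QCO2,
        k6 * r2 + k7 * r1 - D * P - QP,
    ])

# Simple Euler integration under assumed constant operating conditions
xi = np.array([5.0, 1.0, 0.5, 1.0, 10.0, 0.1])
dt, D, Sin = 0.01, 0.05, 35.0
for _ in range(int(100 / dt)):
    xi = xi + dt * model(xi, D, Sin)
```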
3 The Design of Control Strategies 3.1 Preliminaries and Control Objective
Consider the class of SISO affine time-varying nonlinear systems given by
\dot{x}(t) = f(x, t) + g(x, t) u(t), \quad y(t) = h(x, t),        (4)
where x ∈ Π ⊂ ℜ^n is the state vector, u ∈ ℜ and y ∈ ℜ are the input and the output, respectively. Assume that the unknown nonlinear functions f, g : Π × [0, ∞) → ℜ^n and h : Π × [0, ∞) → ℜ are smooth time-varying functions, and the domain Π contains the origin x = 0, which, if u = 0 for all t ≥ 0, is an equilibrium point, that is f(0, t) = 0 for all t ≥ 0. Assume also that the system (4) has a strong relative degree δ, 1 ≤ δ ≤ n (see [1], [10]). If the system (4) has the strong relative degree δ, then
Direct Adaptive Control of an Anaerobic Depollution Bioprocess
237
differentiating y with respect to time δ times, the system dynamics may be written in a so-called normal form [10], [11]:
\dot{ζ}_1 = ζ_2 = L_f^1 h(x,t), \quad \dot{ζ}_2 = ζ_3 = L_f^2 h(x,t), \; …, \; \dot{ζ}_{δ-1} = ζ_δ = L_f^{δ-1} h(x,t),
\dot{ζ}_δ = L_f^δ h(x,t) + L_g L_f^{δ-1} h(x,t) u = α(ζ, η, t) + β(ζ, η, t) u, \quad \dot{η} = f_0(ζ, η, t)        (5)
where ζ ∈ ℜ^δ is the external part containing the process output y and its (δ − 1) derivatives, and η ∈ ℜ^{n−δ} is the internal part containing the states associated with the zero dynamics. Then the δth derivative of y in (4) can be written as
y^{(δ)} = L_f^δ h(x,t) + L_g L_f^{δ−1} h(x,t) u(t) = α(x,t) + β(x,t) u(t),        (6)
where α(x,t) = L_f^δ h(x,t) and β(x,t) = L_g L_f^{δ−1} h(x,t) are modified Lie derivatives [10]. Assume that the control gain β(x,t) is bounded, i.e. 0 < β_0 ≤ β(x,t) ≤ β_1, where β_0 and β_1 are positive constants [8], [11]. Also, assume that there are some functions B(x) ≥ 0 such that |\dot{β}(x,t)| ≤ B(x) for all x ∈ Π. Then the external part ζ may be stabilized by u, while the internal part η is made uncontrollable by the same control [10]. The zero dynamics of system (4) are assumed uniformly exponentially attractive. Then the uniform boundedness of the internal part of the system is ensured [11]. Our main control objective is to design an adaptive control system which will cause the plant output y to asymptotically track a desired output trajectory y_d in the presence of the unknown and time-varying plant parameters, using only local measurements. Usually, the desired output trajectory may be defined as the output of a stable linear reference model, with relative degree greater than or equal to δ, which characterizes the desired performances. Assume that the desired output trajectory and its δ − 1 derivatives y_d, \dot{y}_d, …, y_d^{(δ−1)} are measurable and bounded. Define the tracking error as e_t = y_d − y and a measure of the tracking error as e_s = e_t^{(δ−1)} + λ_{δ−2} e_t^{(δ−2)} + ⋯ + λ_0 e_t [10], that is, the coefficients λ_i, i = 0, 1, …, δ − 2, are picked so that the polynomial
L(s) = s^{δ−1} + λ_{δ−2} s^{δ−2} + ⋯ + λ_0,        (7)
from E_s(s) = L(s) E_t(s), with E_s(s) and E_t(s) the Laplace transforms of e_s(t) and e_t(t), is Hurwitz. Notice that the control goal is to drive e_s(t) → 0 as t → ∞, the shape of the error dynamics being dictated by the choice of the design parameters in L(s).

3.2 A Direct Adaptive Control Law
Assuming that the state x is accessible, firstly we may define an ideal input-output linearizing control law, which compensates for the dynamics of the system, as
u^*(t) = (1/β(x,t)) (−α(x,t) + v(t)) = u_u(x,v) + u_k(x),        (8)
where the signal v(t) is considered as a new input defined as [10]
v(t) = y_d^{(δ)} + γ e_s + \bar{e}_s,        (9)
with \bar{e}_s(t) = \dot{e}_s(t) − e_t^{(δ)}(t), u_u(x,v) the unknown part of the control law, and u_k(x) a known part of the control law, which is assumed to be well defined a priori [10], [11]. In this paper a direct adaptive control law is defined using a radial basis neural network (RBNN) with adjustable parameters to approximate u_u(x,v). A RBNN is made up of a collection of p > 0 parallel processing units called nodes. The output of the ith node is defined by a Gaussian function γ_i(x) = exp(−|x − c_i|² / σ_i²), where x ∈ ℜ^n is the input to the NN, c_i is the centre of the ith node, and σ_i is its size of influence. The output of a RBNN, y = F(x, W), may be calculated as [8]
F(x, W) = Σ_{i=1}^{p} w_i γ_i(x) = W^T(t) Γ(x),        (10)
where W(t) = [w_1(t) w_2(t) … w_p(t)]^T is the vector of network weights and Γ(x) is a set of radial basis functions defined by Γ(x) = [γ_1(x) γ_2(x) … γ_p(x)]^T. Given a RBNN, it is possible to approximate a wide variety of functions f(x) by making different choices for W. In particular, if there is a sufficient number of nodes within the NN, then there is some W^* such that sup_{x∈S_x} |F(x, W^*) − f(x)| < ε, where S_x is a compact set and ε > 0 is a finite constant, provided that f(x) is continuous [8]. Then, the ideal control law (8) may be represented by an ideal controller where the unknown part u_u(x,v) is approximated by a RBNN, F_u = W_u^{*T}(t) Γ(x,v), as
u^*(t) = F_u(x, v, W_u^*) + u_k(x) + μ_u(x, v, t),        (11)
where W_u^* is the vector of the optimal parameters of the approximator, given by [8]
W_u^* = arg min_{W_u ∈ Ω_u} [ sup_{x ∈ S_x, v ∈ S_v} |F_u(x, v, W_u) − (u^* − u_k)| ],        (12)
and μ_u(x, v, t) is the approximation error. From the universal approximation property of RBNNs, there exists W_u^* such that |μ_u| ≤ M_u for some finite M_u > 0. The subspaces S_x, S_v are defined as compact sets through which the state trajectories of the system and the signal v may travel. The subspace Ω_u is a convex compact set that contains feasible parameters for W_u^*. Subsequently, an adaptive algorithm will be defined to estimate W_u^* with Ŵ_u. The estimates Ŵ_u are used to define the controller
û(t) = F_u(x, v, Ŵ_u) + u_k(t) = Ŵ_u^T(t) Γ(x, v) + u_k(t),        (13)
where F_u(x, v, Ŵ_u) is the RBNN output, which approximates the ideal controller defined in (11). A parameter error vector is defined as
W̃_u(t) = Ŵ_u(t) − W_u^*(t).        (14)
Finally, a direct adaptive control law is defined as
u(t) = û(t) + u_sd(t),        (15)
where u_sd(t) is a sliding mode control term defined as [10]:
u_sd(t) = ( B(x)|e_s| / (2β_0²) + M_u(x, v) ) sgn(e_s).        (16)
For the parameter updating in (13) we consider the following law, where Q_u is a positive definite diagonal matrix:
\dot{Ŵ}_u(t) = Q_u^{-1} Γ_u(x, v) e_s(t).        (17)
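As a concrete illustration of the controller (13)–(17), the following Python sketch shows one way the RBF features, the control signal with its sliding-mode term (16), and an Euler discretisation of the update law (17) could be computed for the relative-degree-one case used later in Section 4 (δ = 1, so e_s = e_t). The state vector fed to the network, the centre grid and the usage values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def rbf_features(z, centres, sigma=0.05):
    """Gaussian radial basis functions gamma_i(z) = exp(-|z - c_i|^2 / sigma^2)."""
    d2 = np.sum((centres - z) ** 2, axis=1)
    return np.exp(-d2 / sigma ** 2)

def control(z, e_s, W_hat, centres, u_k, B=0.5, beta0=0.1, M_u=0.01):
    """Direct adaptive control u = W_hat^T Gamma + u_k + u_sd, cf. (13), (15), (16)."""
    gamma = rbf_features(z, centres)
    u_hat = W_hat @ gamma + u_k
    u_sd = (B * abs(e_s) / (2 * beta0 ** 2) + M_u) * np.sign(e_s)  # sliding-mode term (16)
    return u_hat + u_sd, gamma

def update_weights(W_hat, gamma, e_s, dt, Qu_inv=1e-3):
    """One Euler step of the adaptation law (17): dW_hat/dt = Qu^-1 * Gamma * e_s."""
    return W_hat + dt * Qu_inv * gamma * e_s

# Illustrative usage with a small random centre grid (assumption, not the paper's mesh)
rng = np.random.default_rng(0)
centres = rng.uniform(0.0, 1.0, size=(50, 4))
W_hat = np.zeros(50)
z = rng.uniform(0.0, 1.0, size=4)   # measured states fed to the RBNN
e_s = 0.2                           # current tracking error measure
u, gamma = control(z, e_s, W_hat, centres, u_k=0.05)
W_hat = update_weights(W_hat, gamma, e_s, dt=0.01)
```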
3.3 Boundedness of the Controller Parameters
To prove the stability of the direct adaptive controller we assume that the controller parameter variations \dot{W}_u^* are bounded (which may result from practical aspects).
Theorem 1. Consider the nonlinear time-varying system (4) with strong relative degree δ. Assume that (i) 0 < β_0 ≤ β(x) ≤ β_1 with β_0 and β_1 known positive constants, (ii) |\dot{β}(x,t)| ≤ B(x) for some known functions B(x) ≥ 0, (iii) |μ_u(x,v,t)| ≤ M_u(x,v) for some finite M_u > 0, (iv) y_d, \dot{y}_d, …, y_d^{(δ−1)} are measurable and bounded, (v) x, y, \dot{y}, …, y^{(δ−1)} are measurable, (vi) 1 ≤ δ < n with the zero dynamics exponentially attractive, and (vii) \dot{W}_u^* is bounded. Under these conditions there exist direct adaptive control laws (15), (13) and (16) and update law (17) such that all internal signals are uniformly bounded and e_t is small in the mean [10], [11].
Proof. Consider the following Lyapunov function candidate
V(t) = (1 / (2β(x,t))) e_s² + (1/2) W̃_u^T Q_u W̃_u.        (18)
(
)
~ ~& V& (t ) = (es / β( x, t ) ) ⋅ e&s − (β& ( x, t )es2 ) /(2β 2 ( x, t )) + WuT QuWu .
(19)
From definitions of es (t ) and et (t ) one obtains that e& s (t ) = es (t ) + y d( δ ) − y ( δ) and from (6), (9), (15), (8), (13), (11) and (14) we have ~ (20) e&s (t ) = − γes − β( x, t ) WuT Γu ( x, v) − μ u − β( x, t ) u sd .
(
)
With (20) and update law (17), V& (t ) in (19) becomes ⎡ β& ( x, t )es ⎤ γes2 ~ −⎢ 2 V& (t ) = − − μu ⎥ es − esusd − WuT QuW&u* . β( x, t ) ⎣ 2β ( x, t ) ⎦
~ ~ The term − WuT QuW& u* reflects the effect of time varying parameters [10]. Since Wu is bounded, Qu is a constant matrix, and W& * is bounded, one obtains u
~ − WuT QuW& u* ≤ Wγ ,
(21)
E. Petre, D. Şendrescu, and D. Selişteanu
240
for some Wγ > 0 indicating the boundedness or parameter variations. Then
) [(
(
)(
]
)
V& ≤ − γ es2 / β( x, t ) − β& ( x, t )es / 2β2 ( x, t ) − μu ( x, v, t ) es − esusd + Wγ
(22)
Using the boundedness of β, β& and μ u , substituting (16) in (22) and using the fact that e sgn( e ) = | e | we have V& ≤ −( γe 2 / β ) + W . Thus, V& ≤ 0 for | e s | ≥ β1Wγ / γ , s
s
s
s
1
γ
that is the tracking error es is uniformly bounded. As E s ( s ) = L( s ) Et ( s ) and L(s) is a stable polynomial with degree δ − 1 we obtain that the tracking error and its derivatives e, e&, K, e (δ−1) are uniformly bounded. Since the desired reference and its time derivatives y d , y& d ,K , y d(δ−1) are assumed to be bounded, then the system output y and its derivatives y& ,K , y (δ−1) are bounded. Hence, ζ is uniformly bounded and thus x is uniformly bounded. The fact that V& ≤ 0 also implies that parameter estimation Wˆ is u
uniformly bounded (see (14) and the boundedness of Wu* ). Therefore, the boundedness of Wˆ u assures that uˆ and usd and hence u are uniformly bounded. Also, V& ≤ 0 for | e s | ≥ β1Wγ / γ , assures
t +T
∫t
es2 dt ≤ (β1 / γ ) ⋅ (V (t )
−V (t + T )) + (β1 / γ )WγT ,
T > 0 , that is, the tracking error is small in the mean [10]. Note that by choosing γ large enough we can obtain smaller mean for es. In [10] it was proved that in order to obtain the uniform asymptotic stability of the output y it was assumed that | W& u*, j | ≤ k | es | , where W& u*, j is the jth component of the vector W& u* and k is a positive constant (because the tracking error is usually large if the controller parameters vary fast [10]), and the sliding mode control term (16) takes the form u sd = ( B( x) | es |) /( 2β 02 ) + M u ( x, v) sgn(es ) + Wγ sgn( es ) , then V& ≤ − γe s2 / β1 . Thus, V& ≤ 0, and we can obtain the uniform boundedness of all internal signals us-
(
)
∞ ∞ ing a similar analysis as in Theorem 1. From ( γ / β1 ) ∫ es2 dt ≤ ∫ V&dt = V (0) − V (∞) p
0
we obtain that e s ∈ L2 . Since es and e&s are bounded and es ∈ L2 , by Barbalat’s Lemma [3] we have uniform asymptotic stability of es, which implies uniform asymptotic stability of the tracking error et, i.e. lim t →∞ et = 0 . It must be noted that if the above assumption takes the form | W& * |≤ ke − rt , r > 0 , we obtain the same results. u, j
4 Application and Simulation Results
In this section we apply the designed RBNN adaptive control in the case of the anaerobic digestion bioprocess presented in Section 2. In order to control the output pollution level y, as control input we chose the dilution rate, u = D. Then, from (1) we obtain the following input-output process model, whose relative degree is δ = 1:
\dot{y} = (c_2 k_3 − c_1 k_1) X_1 S_1 α_1 − c_2 k_2 X_2 S_2 α_2 − u y + c_1 u S_{in}.        (23)
Here, for the anaerobic digestion process, the main control objective is to maintain the output y at a specified low pollution level y_d ∈ ℜ despite any influent pollutant variation and the time variation of some process parameters.
Consider the ideal case where maximum prior knowledge concerning the process is available, that is, in (2) the specific reaction rates α_1 and α_2 are assumed completely known, while all the state variables and all the inflow and outflow rates are available for on-line measurement. Then the exactly linearizing feedback control law
u^* = (1/(c_1 S_{in} − y)) ( \dot{y}_d + λ_1(y_d − y) − (c_2 k_3 − c_1 k_1) X_1 S_1 α_1 + c_2 k_2 X_2 S_2 α_2 )        (24)
determines a dynamical behaviour of the closed-loop system described by the following first-order linear stable differential equation:
(\dot{y}_d − \dot{y}) + λ_1 (y_d − y) = 0,  λ_1 > 0.        (25)
The controller (24) leads to the linear error model \dot{e} = −λ_1 e, where e = y_d − y represents the tracking error, which for λ_1 > 0 has an exponentially stable point at e = 0. Since the prior knowledge about the process assumed above is not realistic, we analyze a more realistic case, where the process dynamics are incompletely known and time varying. We assume that the specific reaction rates α_1 and α_2 are completely unknown. Since the control law (24) contains the specific reaction rates α_1 and α_2, assumed unknown and time varying, these must be estimated by using a RBNN. Using (24), the control law (8) can be rewritten in the form
u(t) = (1/β(ξ)) ( −α(ξ,t) + \dot{y}_d + λ_1(y_d − y) ) = u_u(ξ,t) + u_k(ξ),        (26)
where u_u(ξ,t) = −α(ξ,t)/β(ξ,t) and u_k(ξ) = ( \dot{y}_d + λ_1(y_d − y) )/β(ξ,t). In (26) the function α(ξ,t) = −(c_2 k_3 − c_1 k_1) X_1 S_1 α_1 + c_2 k_2 X_2 S_2 α_2 is assumed to be unknown, while the function β(ξ) = c_1 S_{in} − y is assumed to be known and, moreover, |β(ξ)| ≥ β_0 > 0. Then a RBNN is used to construct an on-line estimate of u_u(ξ,t), respectively of α(ξ,t), using the adaptive algorithm from Subsection 3.2.
The performance of the designed controllers has been tested through extensive simulations by using the process model (1) under realistic conditions. The values of the yield and kinetic coefficients are [16]: k_1 = 3.2, k_2 = 16.7, k_3 = 1.035, k_4 = 1.1935, k_5 = 1.5, k_6 = 3, k_7 = 0.113, μ_1^* = 0.2 h⁻¹, K_{M1} = 0.5 g/l, μ_2^* = 0.35 h⁻¹, K_{M2} = 4 g/l, K_{I2} = 21 g/l, and the values c_1 = 1.2, c_2 = 0.75. It must be noted that for the adaptive controller a full RBNN with deviation σ_i = 0.05 was used. The centres c_i of the radial basis functions are placed in the nodes of a mesh obtained by discretization of the states X_1 ∈ [1, 12] g/l, X_2 ∈ [0.4, 0.65] g/l, S_1 ∈ [0.1, 1.4] g/l and S_2 ∈ [0.3, 1.6] g/l with dX_i = dS_i = 0.1 g/l, i = 1, 2. The controller parameter vector Ŵ_u was initialized to zero. The gain matrix was chosen as Q_u^{-1} = diag(10^{-3}). The values used in the sliding mode control term were chosen as B = 0.5, β_0 = 0.1, M_u = 0.01. The reference y_d in (24) and (26) is obtained by filtering a desired piecewise constant setpoint y_sp through the reference model \dot{y}_d = −λ_d y_d + λ_d y_sp, with λ_d = 0.5. The response of the closed-loop system with the RBNN adaptive controller, by comparison to the response with the exactly linearizing controller (as benchmark case), is given in Fig. 1. We consider that the kinetic coefficients μ_1^*, μ_2^* are time varying as
μ_1^*(t) = 0.2 (1 − 0.1 cos(πt/10)),  μ_2^*(t) = 0.35 (1 − 0.1 cos(πt/10)).        (27)
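As an illustration of these simulation settings, the sketch below builds the grid of RBF centres over the state ranges stated above and generates the filtered reference y_d together with the time-varying coefficients (27). The sampling period and the setpoint sequence are assumptions for the example, not values from the paper.

```python
import numpy as np

# Grid of RBF centres over the stated state ranges (step 0.1 g/l)
X1 = np.arange(1.0, 12.0 + 1e-9, 0.1)
X2 = np.arange(0.4, 0.65 + 1e-9, 0.1)
S1 = np.arange(0.1, 1.4 + 1e-9, 0.1)
S2 = np.arange(0.3, 1.6 + 1e-9, 0.1)
centres = np.array(np.meshgrid(X1, X2, S1, S2)).reshape(4, -1).T  # (nodes, 4)

# Reference model dy_d/dt = -lambda_d*y_d + lambda_d*y_sp and coefficients (27)
lam_d, dt = 0.5, 0.01                       # filter gain from the paper, assumed step size
t = np.arange(0.0, 100.0, dt)
y_sp = np.where(t < 50.0, 0.6, 0.9)         # assumed piecewise-constant setpoint
y_d = np.zeros_like(t)
for k in range(1, len(t)):
    y_d[k] = y_d[k - 1] + dt * lam_d * (y_sp[k - 1] - y_d[k - 1])

mu1_star = 0.2 * (1.0 - 0.1 * np.cos(np.pi * t / 10.0))   # time-varying Monod coefficient
mu2_star = 0.35 * (1.0 - 0.1 * np.cos(np.pi * t / 10.0))  # time-varying Haldane coefficient
```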
In Fig. 2 the control action is presented, both for the exactly linearizing and the adaptive control cases, while in Fig. 3 the function α(ξ,t) provided by the RBNN is depicted versus the "real" function. Finally, Fig. 4 shows the time variation of the disturbance S_in. From these figures it can be seen that the behaviour of the adaptive system with the NN controller is very good, although the process dynamics is incompletely known and time varying. Note also the regulation properties and the ability of the controller to maintain the controlled output y close to its desired values (a very low level for y_d) despite the very high load variations of S_in. It must be noted that a preliminary tuning of the RBNN controller is not necessary.
Fig. 1. The controlled output y (exactly linearizing (1) versus adaptive case (2)); y [g/l] versus Time (h)
Fig. 2. The control action u (exactly linearizing (1) versus adaptive case (2)); u [1/h] versus Time (h)
Fig. 3. The function provided by the RBNN (1) versus the ideal one (2); [g/lh] versus Time (h)
Fig. 4. The profile of the disturbance Sin; [g/l] versus Time (h)
5 Conclusion
A direct adaptive control strategy for nonlinear systems whose dynamics are incompletely known and time varying was presented. The controller design is based on the input-output linearizing technique. The unknown controller functions are approximated using radial basis neural networks. The form of the controller and the neural controller adaptation laws were derived from a Lyapunov analysis of stability. It was demonstrated that, under certain conditions, all the controller parameters remain bounded and the plant output tracks, with user-specified dynamics, the output of a linear reference model. The proposed algorithm was applied to control the pollution level in the case of an anaerobic bioprocess with strongly nonlinear, time varying and not exactly known dynamical kinetics, used for wastewater treatment. The simulation results confirm the effectiveness of this controller by comparison to an exactly linearizing nonadaptive controller.
Acknowledgments. This work was supported by CNCSIS–UEFISCSU, Romania, project number PNII–IDEI 548/2008.
References 1. Isidori, A.: Nonlinear Control Systems, 3rd edn. Springer, Berlin (1995) 2. Sastry, S., Isidori, A.: Adaptive control of linearizable systems. IEEE Trans. Autom. Control 34(11), 1123–1131 (1989) 3. Sastry, S., Bodson, M.: Adaptive control - Stability, Convergence and Robustness. Prentice Hall International, Inc., Englewood Cliffs (1990) 4. Fidan, B., Zhang, Y., Ioannou, P.A.: Adaptive control of a class of slowly time varying systems with modelling uncertainties. IEEE Trans. on Autom. Control 50, 915–920 (2005) 5. Hayakawa, T., Haddad, W.M., Hovakimyan, N.: Neural network adaptive control for a class of nonlinear uncertain dynamical systems with asymptotic stability guarantees. IEEE Trans. on Neural Networks 19, 80–89 (2008) 6. McLain, R.B., Henson, M.A., Pottmann, M.: Direct adaptive control of partially known non-linear systems. IEEE Trans. Neural Networks 10(3), 714–721 (1999) 7. Petre, E., Selişteanu, D., Şendrescu, D., Ionete, C.: Neural networks-based adaptive control for a class of nonlinear bioprocesses. Neural Comput. & Applic. 19(2), 169–178 (2010) 8. Spooner, J.T., Passino, K.M.: Decentralized adaptive control of nonlinear systems using radial basis neural networks. IEEE Trans. on Autom. Control 44(11), 2050–2057 (1999) 9. Funahashi, K.: On the approximate realization of continuous mappings by neural networks. Neural Networks 2, 183–192 (1989) 10. Diao, Y., Passino, K.M.: Stable adaptive control of feedback linearizable time-varying nonlinear systems with application to fault-tolerant engine control. Int. J. Control 77(17), 1463–1480 (2004) 11. Petre, E., Selişteanu, D., Şendrescu, D.: Neural Networks Based Adaptive Control for a Class of Time Varying Nonlinear Processes. In: Int. Conf. on Control, Automation and Systems ICCAS 2008, COEX, Seoul, Korea, October 14-17, pp. 1355–1360 (2008) 12. Bastin, G., Dochain, D.: On-line Estimation and Adaptive Control of Bioreactors. Elsevier, Amsterdam (1990) 13. Dochain, D., Vanrolleghem, P.: Dynamical Modelling and Estimation in Wastewater Treatment Processes. IWA Publishing (2001) 14. Petre, E.: Nonlinear Control Systems – Applications in Biotechnology, 2nd edn. Universitaria, Craiova (2008) (in Romanian) 15. Dochain, D. (ed.): Automatic Control of Bioprocesses. ISTE and John Wiley & Sons, UK (2008) 16. Petre, E., Selişteanu, D., Şendrescu, D.: Adaptive control strategies for a class of anaerobic depollution bioprocesses. In: Proc. of Int. Conf. on Automation, Quality and Testing, Robotics, Cluj-Napoca, Romania, Tome II, May 22-25, pp. 159–164 (2008)
Visualisation of Test Coverage for Conformance Tests of Low Level Communication Protocols
Katharina Tschumitschew¹, Frank Klawonn¹, Nils Obermöller², and Wolfhard Lawrenz²
¹ Department of Computer Science, Ostfalia University of Applied Sciences, Salzdahlumer Str. 46/48, D-38302 Wolfenbüttel, Germany
² C&S group GmbH, Am Exer 19c, D-38302 Wolfenbüttel, Germany
Abstract. Testing is a process for detecting errors in the design of software and hardware. Tests can normally not prove the correctness of programs or hardware, since they usually cover only a small subset of all possible inputs or situations. Therefore, test cases should be chosen carefully and should cover the set or space of possible inputs or situations as broadly as possible. Thus, test coverage is one of the crucial questions in testing. In this paper, we focus on conformance tests for low level communication protocols like FlexRay or LIN as they are found very often in cars. These conformance tests are usually defined by choosing specific values for a larger number of variables. In this paper, we propose a method that can visualise the test cases in comparison to all possible situations and provides important information about the test coverage.
1 Introduction
Electronic devices in cars play a crucial role for the comfort of the driver and the passengers (navigation, entertainment, ...) as well as for safe driving (ABS, ESP, etc.). The electronic devices need to exchange information via communication networks in cars. Typical low level communication protocols for such networks are, for example, Controller Area Network (CAN), FlexRay and Local Interconnect Network (LIN) (for an overview see for instance [1]). Since some of the devices in the car are responsible for safety critical tasks, it must be ensured that the communication networks operate properly and that devices strictly obey the protocols in order to guarantee interoperability. Before a car manufacturer introduces a new device like an ECU (electronic control unit) component (for instance, a communication controller or transceiver) into a car, the device must pass certain conformance tests, ensuring that the device functions properly together with other devices. Such conformance tests usually consist of specific configurations in which the device must react properly. A configuration is usually described by a number of (mostly)
real variables. For each communication protocol a specification is available in which constraints on the variables are described. The constraints consist of limitations of the range of variables but also of relations between variables in the form of equations and inequalities. The test specification for a specific specification of a communication protocol defines the test cases. Very often, test houses extend this set of mandatory tests for better test coverage. The test cases are based on experience and also on systematic analysis of the specification. However, it is not clear at all how well the test cases cover the set of admissible configurations. In contrast to classical software testing [2,3] for computer programs, where criteria like function, statement, branch, decision or path coverage are of interest, we have to deal with configurations of numerical values, so that these test coverage methods are not applicable in our context. In this paper, we propose to apply visualisation techniques to better understand the test coverage for such protocols. The paper is organised as follows. We first describe in Section 2 the given problem in an abstract way, i.e. the general structure of the so-called system operational variable space consisting of the variables involved in the communication protocol and the constraints for the variables. Section 3 introduces our ideas for visualising the test coverage. These ideas are then applied to two communication protocols – FlexRay and LIN – in Section 4. The conclusions in Section 5 provide perspectives for future work.
2 Characterisation of the Constraints
It is not in the scope of this paper to discuss the meaning of the variables involved in low level communication protocols. In order to illustrate how our approach works, we use a simple abstract example. A conformance test for low level communication protocols is usually based on setting variables to specific values that determine the test case. The variables do not only have limited ranges, but there are also relations between the variables in the form of equations or inequalities. Let us consider the three variables I1, R1 and U1 with the constraints
I1 + I2 = I3,
(1)
0 ≤ I2 ≤ 1,
(2)
0 ≤ I3 ≤ 1,
(3)
R1 = U1 / I1,
(4)
0 ≤ I1 ≤ 1,
(5)
0 ≤ U1 ≤ 5.
(6)
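A minimal Python sketch of how constraints (1)–(6) can be checked for a candidate configuration is given below; the tolerance used for the equality constraints is an assumption for illustration.

```python
def satisfies_constraints(cfg, tol=1e-9):
    """Check the toy constraints (1)-(6) for a configuration dict."""
    I1, I2, I3 = cfg["I1"], cfg["I2"], cfg["I3"]
    R1, U1 = cfg["R1"], cfg["U1"]
    return (
        abs(I1 + I2 - I3) <= tol          # (1)
        and 0.0 <= I2 <= 1.0              # (2)
        and 0.0 <= I3 <= 1.0              # (3)
        and I1 > 0.0                      # needed so that (4) is well defined
        and abs(R1 - U1 / I1) <= tol      # (4)
        and 0.0 <= I1 <= 1.0              # (5)
        and 0.0 <= U1 <= 5.0              # (6)
    )
```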
Figure 1 shows the subset – i.e. the surface – of admissible parameter configurations taking the constraints (1)–(6) into account. Only those points on the surface shown in
Fig. 1. The surface of admissible parameter combinations given by the constraints in Equations (1)–(6)
Figure 1 are admissible parameter configurations. The points outside the surface are not permitted. If we only have to deal with three variables as in this simple example, we could directly use the visualisation in Figure 1 and also include the test cases as spots on the surface of admissible parameter configurations. However, the situation is usually more complicated.
– The equations and inequalities can be more complex than in the example and it can be difficult or even impossible to compute the set of admissible parameter configurations in a simple closed form solution.
– The number of variables or parameters is usually much larger than 3 as in the example. For instance, in the case of the FlexRay protocol that is considered in more detail here, 45 variables are defined, of which 29 variables can be chosen – given the specified constraints are satisfied. Therefore, the possible configurations form a subset of a 29-dimensional space and not of a 3-dimensional space.
In the next section, we propose a method with which we can still construct visualisations of the subset of admissible configurations including the location of the test cases.
3 Visualisation of Admissible Configurations and Test Cases Our method consists of four steps. – In the first step, we generate a large number of random parameter configurations ignoring the constraints. One can imagine the random parameter configurations as the spots in Figure 1.
Fig. 2. Elimination of the random test cases in Figure 1 that do not satisfy the constraints
Fig. 3. Adding the test cases from the test case specification to the admissible random configurations in Figure 2
Fig. 4. Visualisation of the projection to two variables for the FlexRay protocol: A set of generated admissible random configurations (small blue points) vs. test cases (large red points)
– In the second step, we remove all spots (parameter configurations) that do not satisfy the given constraints as illustrated in Figure 2. If we have generated sufficiently many random parameter configurations, the remaining spots corresponding to admissible parameter configuration will cover the set of all admissible parameter configurations in a sufficiently dense manner. – Then we add the test cases to the set of admissible random configurations indicated by the red spots in Figure 3. – It should be noted that Figures 1–3 only illustrate the first three steps of our methods. The random parameter configurations and the test cases are not points in a 3-dimensional space, but in a much higher-dimensional space – 29 dimensions in the case of FlexRay. Therefore, we still need techniques to visualise the location of the test cases in comparison to the random parameter configuartions. Scatter plots are a simple way to visualise the test cases together with the random parameter configurations. A scatter plot shows the projection of the high-dimensional data to two chosen variables. Note that the number of scatter plots can be quite large. In the
case of FlexRay, we have \binom{29}{2} = 406 scatter plots – too many to inspect all of them in detail. We can either just pick randomly some of these scatter plots or use other projection techniques. Principal component analysis (PCA) [4] is a dimension reduction technique that fits a projection plane¹ into a high-dimensional space so that the projection of the data in the high-dimensional space to the projection plane preserves as much of the variance of the data as possible. An alternative to PCA is multidimensional scaling (MDS) [5], whose aim is to visualise high-dimensional data in a plane. But instead of trying to preserve the variance as PCA does, MDS tries to preserve the distances between the data in the high-dimensional space.
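The following Python sketch outlines the overall procedure of Section 3 for the toy constraints of Section 2: random sampling, filtering by the constraints, and a two-dimensional PCA projection of the admissible configurations together with the test cases. How the sampling is organised (equality constraints used to compute dependent variables, inequality constraints checked by rejection), the sample size and the stand-in test cases are illustrative assumptions, not the authors' actual tool.

```python
import numpy as np

rng = np.random.default_rng(42)

# Step 1: sample the free variables; the equality constraints (1) and (4) are used
# here to compute the dependent variables (an assumption about the sampling scheme).
n = 100_000
I1 = rng.uniform(0.0, 1.0, n)
I2 = rng.uniform(0.0, 1.0, n)
U1 = rng.uniform(0.0, 5.0, n)
I3 = I1 + I2                      # constraint (1)
R1 = U1 / np.maximum(I1, 1e-9)    # constraint (4)

# Step 2: reject configurations violating the range constraints (2), (3), (5), (6).
ok = (I2 <= 1.0) & (I3 <= 1.0) & (I1 <= 1.0) & (U1 <= 5.0) & (I1 > 0.0)
admissible = np.column_stack([I1, I2, I3, U1, R1])[ok]

# Step 3: the specified test cases would be loaded here; a few admissible points
# are used as stand-ins for illustration.
test_cases = admissible[:5]

# Step 4: PCA projection of admissible configurations and test cases to 2-D.
mean = admissible.mean(axis=0)
_, _, Vt = np.linalg.svd(admissible - mean, full_matrices=False)
components = Vt[:2]
proj_admissible = (admissible - mean) @ components.T
proj_tests = (test_cases - mean) @ components.T
```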
Fig. 5. Visualisation of two principal components for the FlexRay protocol: A set of generated admissible random configurations (small blue points) vs. test cases (large red points)
¹ PCA is not restricted to fitting a two-dimensional projection plane into the high-dimensional space. The projection plane can be of arbitrary dimension. Of course, it should be lower than the dimension of the original higher-dimensional space.
Note that in our case, the data are the admissible random parameter configurations together with the test cases.
4 Results We have applied our proposed method to two communication protocols: FlexRay and LIN. The tests have been taken from the respective test specifications for the protocols. Figure 4 shows a scatter plot resulting from a projection to the two variables gMacroPerCycle and gPayLoadLengthStatic of the FlexRay protocol. The test cases are shown in red, the random cases in blue. It can be seen that the test cases for these two variables focus on extreme cases at the boundary of the (sample of) admissible configurations, except the one case more or less in the middle. This shows that, at least for these two variables, the test cases seem to be chosen more or less well, taking into account that failures usually occur in extreme cases and not in standard configurations. However, the
Fig. 6. Visualisation of a projection for the LIN protocol: A set of generated admissible random configurations (small blue points) vs. test cases (large red points)
Fig. 7. Visualisation of two principal components for the LIN protocol: A set of generated admissible random configurations (small blue points) vs. test cases (large red points)
extreme cases are mainly tested for the second variable. There are no test cases where the first variable has an extreme value, but the second has a non-extreme value. Figure 5 shows the projection to two principal components, indicating again that most test cases represent extreme configurations, but that there is also a large area of configurations where no tests are applied. For the protocol LIN, there are 24 variables whose values can be set, given the constraints are satisfied. Figure 6 shows a projection to the two variables TH INTERBYTE and protected ID, whereas Figure 7 shows two principal components for the LIN protocol. In both figures, there is a line of test cases. The reason for this line is that some parameters are only varied in a few test cases resulting in the corresponding lines. Therefore, the LIN test cases also do not go as much to the extreme values as the FlexRay test cases as can be seen in Figure 6 which contains more or less no test case for extreme values for this specific projection. One of the reasons, why the LIN test case coverage is lower than for FlexRay is that – although different values are admissible for
all variables – for some of the variables default values are chosen that are more or less never changed. There are also variables for which a certain setting makes the value of another variable irrelevant. Therefore, in such cases it does not make sense to define test cases with variations of the irrelevant variable. Figure 7 includes some extreme test cases, but they still do not cover the whole boundary of the admissible configuration properly.
5 Conclusions
Our visualisation technique can help to get an impression of how well the chosen test cases cover the set of all admissible configurations. Regions without test cases can be identified, and one can check whether it might be useful to add test cases in such regions or whether such test cases might not be of high importance. The test cases defined in the test specification of a communication protocol cannot be changed without agreement from the standardisation committees responsible for the protocol. However, as a certified test house, C&S group GmbH has introduced additional tests for better test coverage which can be derived from the proposed visualisations. Of course, before introducing a new test case that increases the visual test coverage, it is necessary to check whether the test case is meaningful in terms of practical situations. Our future research work will focus on more sophisticated projection techniques known as nonlinear PCA [6,7,8]. We also plan to extend our approach to protocols that are mainly based on discrete variables. This would enable us to also include higher level modern standards like the Automotive Open System Architecture (AUTOSAR) [9].
References 1. Paret, D.: Multiplexed Networks for Embedded Systems: CAN, LIN, FlexRay, Safe-by-Wire... Wiley, Chichester (2007) 2. Myer, G.: The Art of Software Testing, 2nd edn. Wiley, Hoboken (2004) 3. Woodward, M., Hennell, M.: On the relationship between two control-flow coverage criteria: all JJ-paths and MCDC. Information and Software Technology 48, 433–440 (2006) 4. Jolliffe, I.: Principal Component Analysis. Springer, New York (2002) 5. Borg, I., Groenen, P.: Modern Multidimensional Scaling: Theory and Applications. Springer, Berlin (1997) 6. Kolodyazhniy, V., Klawonn, F., Tschumitschew, K.: Neuro-fuzzy model for dimensionality reduction and its application. International Journal of Uncertainty, Fuzziness and KnowledgeBased Systems 15, 571–593 (2007) 7. Lowe, D., Tipping, M.: Feed-forward neural networks topographic mapping for exploratory data analysis. Neural Computing and Applications 4, 83–95 (1996) 8. Scholz, M., Kaplan, F., Guy, C., Kopka, J., Selbig, J.: Non-linear pca: A missing data approach. Bioinformatics 21, 3887–3895 (2005) 9. Kindel, O., Friedrich, M.: Softwareentwicklung mit AUTOSAR. Grundlagen, Engineering, Management f¨ur die Praxis. Dpunkt.verlag, Heidelberg (2009)
Control Network Programming with SPIDER: Dynamic Search Control Kostadin Kratchanov1, Tzanko Golemanov2, Emilia Golemanova2, and Tuncay Ercan1 1
Department of Software Engineering, Yasar University, Universite Cad. Yagacli Yol No.35-37, Bornova/Izmir, 35100 Turkey {kostadin.kratchanov,tuncay.ercan}@yasar.edu.tr 2 Department of Computing, Rousse University, 8 Studentska Street, Rousse, Bulgaria {EGolemanova,TGolemanov}@ecs.ru.acad.bg
Abstract. The report describes the means for dynamic control of the computation process that are available in Spider – a language for Control Network Programming. Keywords: Control network programming, programming paradigms, programming languages, non-procedural programming, search control, computation control, dynamic control, heuristic algorithms.
1 Introduction This report builds on publications [1-4] and, together with [5], completes this series of six papers describing the fundamentals of the programming paradigm referred to as Control Network Programming or CNP: the rationale behind its development, the structure of a program, a program’s interpretation, problem types particularly well suited for CNP, solutions to representative problems. More specifically, the exposition and descriptions focus on the SPIDER programming language for CNP which is the core of the SPIDER programming environment for CNP1. 1.1 A Program in CNP Is a Network – The CN Control Network Programming means programming through “control networks”. It is a programming paradigm in which the program can be visualized as a finite set of graphs. This set of graphs is called a control network (CN). The graphs comprising the CN are referred to as subnets; one of the subnets is identified as the main subnet. Each subnet consists of nodes (states) and arrows. A chain of primitives is assigned to each arrow. The primitives may be thought of as elementary actions and are technically user-defined procedures. A subnet may call other subnets or itself. Both subnets and primitives can have parameters and variables. The complete program consists of two main components - the CN and the definitions of the primitives. 1
A considerably more advanced programming environment called WinSpider is now operational; however, for consistency, all explanations and examples here are based on SPIDER. The SPIDER environment for CNP is available for free download at http://controlnetworkprogramming.com
An example of the graphical representation of a simple CN is shown in Fig.1. The corresponding textual representation of the same CN in the SPIDER language is also demonstrated. This exemplary CN consists of a single subnet called Astar. It is a working demo of a possible CNP implementation of the A*-algorithm (e.g., [6-9]) for solving the 8-puzzle problem and was originally presented in [10]. Note that the above example is not meant as representative for the type of problems where using CNP is particularly effective but is rather a simple illustration of a CN.
Fig. 1. Graphical and textual representations of the CN of an implementation of the A*algorithm for solving the 8-puzzle problem
The program in CNP integrates conveniently declarative and imperative features where a possibly declarative problem description – the CN – can be considered as “leading”. In general, this description may be nondeterministic [1-3], and the execution (computation, inference, interpretation) of a CN program is essentially search in the CN. Generally, the built-in search strategy (a form of extended backtracking [3,4]) directs automatically the search and, ideally, the programmer should not be concerned with procedural details of the computation. In practice, however, in many cases procedural aspects are significant with ramification for efficiency, termination and choice among alternative solutions. Therefore, similarly to Prolog and other declarative languages, Spider provides means for control of the computation process [11,12].
These means are the so-called system options for static search control which are discussed in detail in [4]. The system options of this group are summarized in Table 1.
Table 1. Static control system options
Option       | Default value | Other values               | Description
SOLUTIONS    | 1             | ALL, ASK, unsigned integer | Specifies max number of solutions to be found
BACKTRACKING | YES           | NO                         | Is backtracking allowed
ONEVISIT     | NO            | YES                        | Are multiple entries into a node (within a recursive level) allowed
LOOPS        | ANY           | unsigned integer           | Max number of repeated visits to a node in the current path
RECURSION    | ANY           | unsigned integer           | Max number of repeated invocations of a subnet
1.2 Dynamic Computation Control and the Purpose of This Report
In addition to the static search control discussed above, Spider features powerful tools for a second type of computation control which we call dynamic search control. Two groups of tools for dynamic search control are available: system options for dynamic search control and control states. The former are summarized in Table 2. The control states available are SELECT, RANGE and ORDER.
Table 2. Dynamic control system options
Option         | Default value | Other values                        | Description
ARROWCOST      | 1             | real value (const, var)             | All arrows in scope have the specified cost
MAXPATHCOST    | ANY           | real value (const, var)             | Arrow fails if current cost exceeds value
RANGEORDER     | NO            | LOWFIRST, HIGHFIRST, RANDOM         | See description of control state RANGE
ORDEROFARROWS  | NO            | RANDOM                              | Is the order of attempting arrows emanating from a given (regular) state random?
PROXIMITY      | 0             | real value (const, var), or NEAREST | See description of control state SELECT
NUMBEROFARROWS | ANY           | integer value (const, var)          | Max number of attempted outgoing arrows
The fundamental purpose for introducing dynamic search control is the “automatic” (or “nonprocedural”, “declarative”) implementation of heuristic algorithms. Paper [5] is especially devoted to this issue. The aim of the current report is to describe the tools for dynamic search control provided in SPIDER with simple illustrations.
The phrases “Static control” and “Dynamic control” are somewhat imprecise and have been chosen mainly for convenience. Other possible distinctive names would have been “Basic control” and “Advanced control”. Static control refers to controlling the basic parameters of the search process while Dynamic control is mainly related to more advanced features leading to implementing heuristic and more efficient algorithms. We may argue that the values of the static options are chosen when writing the CN and cannot be changed at execution time. In contrast, most dynamic options’ values can be set equal to values of given variables, and therefore may be dynamically changed by changing the values of these variables during execution. In principle, it would be easy to allow the values of the options from the “static” group to also equal variables; however, we have not needed such a feature so far. 1.3 A Grain of History Ideas related to controlling the computation/search process and modeling heuristic strategies in CNP have been discussed in the Intelligent Systems group in Rousse University led by K. Kratchanov and I. Stanev, and many colleagues have contributed to the development, implementation and applications of those ideas. Currently, there are two main lines of research. They have common roots but subsequently developed different foci, solutions, implementations and achievements. The first line is presented in this paper; some earlier publications are [10,13,14]. The programming language used in the second approach is called Net. There are considerable differences between SPIDER and Net in concept, syntax and implementation discussed in [10]; most importantly, Net follows the concept of a completely specified language while Spider uses an underlying language [1]. Some publications related to the second line are [15,16].
2 Arrow Costs and System Option ARROWCOST As emphasized repeatedly (e.g., [2,14]), problems that can be naturally visualized as graph-like structures are prime candidates for effective and simple CNP solutions. One particular example is the case when the problem representation is based on the state space approach [7]. Often, the arrows in graph-like problem representations are assigned numeric values called costs. For example, in the map traversal problem [4Fig.7,6,17] or the traveling salesperson problem [7,8,19] the costs are the distances between the cities2; in the sheep and wolves problem [2,9] there might be different costs for carrying one sheep or one wolf in the boat, and so on. Normally, in a CNP solution to a problem with graph-like representation the CN mimics this representation [2]. Therefore, possible advantages may be expected if we can specify costs for the arrows in the CN. In SPIDER arrows can be assigned costs in two ways. Firstly, it can be done directly in the textual representation of the CN using prefixes in brackets. For instance, modifying the example of Fig. 1, we can write 0: 2
(20) Init
>1;
If a state represents a partial tour the cost of an arrow will be the additional distance introduced by traveling to the latest point in the tour.
specifying cost 20 for the given arrow. The corresponding segment of the graphical representation of the CN will look as follows: 0
(20) Init
1
Secondly, identical arrow costs can be specified using the system option ARROWCOST for all arrows in the scope [4] of the system option.
Fig. 2. Using arrow costs: Finding all paths with their lengths
Fig. 2 is a modification of the solution to the map traversal problem shown in Fig. 9 of [4]. For convenience, the map is repeated in Fig. 3 below. In the example, all paths between given initial and final cities will be found and printed together with the length in kilometers of each path. Value ARROWCOST=0 must be specified with range the whole CN. In this way all “system” arrows without specified cost will get cost 0 and will not affect the accumulated cost of a solution found. The cost of the
current path is automatically calculated by the SPIDER interpreter as value of an accessible system variable (called O.CurrCost) and will be printed by the primitive Print.
Fig. 3. The Map Traversal problem
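To illustrate what the CN of Fig. 2 computes, the Python sketch below enumerates all paths between two cities together with their accumulated costs, in the spirit of ARROWCOST and the system variable O.CurrCost. The road map and its distances are hypothetical stand-ins, not the actual values of Fig. 3 (only the totals for the paths ABD and ABC mentioned in the next section were used to pick plausible numbers).

```python
# Hypothetical road map; the individual distances are illustrative assumptions.
MAP = {
    "A": [("B", 70), ("C", 80)],
    "B": [("D", 50), ("C", 80)],
    "C": [("D", 60)],
    "D": [],
}

def all_paths(node, goal, path=None, cost=0.0):
    """Depth-first enumeration of all paths with their accumulated cost,
    mirroring what the CN of Fig. 2 does with ARROWCOST and O.CurrCost."""
    path = (path or []) + [node]
    if node == goal:
        print("".join(path), cost)           # analogue of the primitive Print
        return
    for succ, arrow_cost in MAP[node]:
        if succ not in path:                 # cf. the ONEVISIT / LOOPS options
            all_paths(succ, goal, path, cost + arrow_cost)

all_paths("A", "D")
```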
3 System Option MAXPATHCOST and Branch-and-Bound If the accumulated cost of the current path exceeds the value of the system option MAXPATHCOST the forward execution of the active arrow fails. Note that the value of MAXPATHCOST may equal the value of a variable thus allowing dynamic change of the threshold. Let us consider the exemplary map of Fig.3 and assume that a minimal path from A to D is sought. Our solution is based on the CN of Fig.2, with MAXPATHCOST=Bound added for the whole CN where Bound is a variable whose initial value is the maximal real value. Note that SOLUTIONS=ALL. Primitive Print must also be slightly modified by adding the statement Bound:=O.CurrCost. The first solution path found and printed will be ABD with cost 120. During the execution of primitive Print the value of variable Bound will be changed to 120. Then the interpreter will backtrack from state FINISH of the main subnet, the control will enter backwards subnet MAP throw nodes RETURN and D into node B. From B, the second outgoing arrow to node C will be attempted. The current path cost will become 150. As this value exceeds the current value of MAXPATHCOST which is 120, arrow BÆC will fail and the control will return to node B. This means that the system will not produce the solution path ABCD whose cost is worse than that of the current optimal cost 120 of path ABD. Same applies to solution ACD. It is clear that the last found solution path will be optimal (or shortest if all costs in the map are 1). Thus, our CNP program will always produce an optimal (or shortest) solution. Moreover, in the process of search many “unpromising” paths such as ABC will be pruned. Such paths are called insipid paths in [9]. Our solution is equivalent or very similar to the Branch-and-Bound strategy [7,9,18] and DFS adapted for finding the cheapest solution within a prescribed amount of search effort [9]. Branch-andBound in the literature usually means “best-first branch-and-bound” [19]; our solution is not “best-first” but is completely non-procedural. (Branch-and-Bound in [17] has a different meaning.)
A simpler version of our solution where the value of MAXPATHCOST is set to a constant will correspond to what is known as cost-bound (respectively length-bound) depth-first search [9].
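To make the pruning behavior concrete, the following Python fragment is a procedural paraphrase of what the combination SOLUTIONS=ALL and MAXPATHCOST=Bound makes the interpreter do; it is not SPIDER code, and the road distances are invented for illustration (chosen only so that the run reproduces the narrative above: ABD is found with cost 120 and the extensions through C are pruned).

```python
# Depth-first search with a dynamically tightening bound (branch-and-bound).
ROADS = {'A': [('B', 20), ('C', 130)],      # assumed distances, not those of Fig. 3
         'B': [('D', 100), ('C', 130)],
         'C': [('D', 60)],
         'D': []}

def branch_and_bound(start, goal):
    best, bound = None, float('inf')        # Bound starts as the maximal real value

    def dfs(city, path, cost):
        nonlocal best, bound
        if cost > bound:                    # MAXPATHCOST exceeded: the arrow fails
            return
        if city == goal:
            best, bound = list(path), cost  # primitive Print: Bound := O.CurrCost
            print('solution', '-'.join(path), 'cost', cost)
            return
        for nxt, dist in ROADS[city]:
            if nxt not in path:             # do not revisit cities on the path
                dfs(nxt, path + [nxt], cost + dist)   # backtracking on return

    dfs(start, [start], 0)
    return best, bound

print(branch_and_bound('A', 'D'))           # the last solution printed is optimal
```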
4 Order of Attempting the Outgoing Arrows and Control States

Normally the arrows emanating from a given node in the CN are attempted in the order they are specified in the CN. The three control states featured in SPIDER (ORDER, RANGE, and SELECT) allow this order to be controlled and changed, even dynamically. In order to achieve this (and, consequently, a certain heuristically informed choice), arrows emanating from a control state can be equipped with "heuristic" values – such arrows will be called evaluated arrows. Note that these evaluations are different from the arrow costs discussed earlier. The three control states are introduced below. Their usage is illustrated in [5].

4.1 Control State ORDER

The first type of control state is called control state ORDER (Fig. 4). Here, Control is the identifier of the state, and sel and S1 are variables. The outgoing arrows are evaluated – the evaluations can be constants or variables, as in the case of S1. A control state ORDER has an associated quantity called a selector; in the example the selector's value is the value of the variable sel. The interpreter chooses the arrows in the order of how close their evaluations are to the value of the state selector. For instance, if the values of both sel and S1 are 0 then the arrows will be attempted in the order b, c, a; if the values of sel and S1 are 100 and 3.14, respectively, the order of the arrows will be a, b, c.
Fig. 4. Control state ORDER – graphical representation and specification in the CN
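Stated procedurally, an ORDER state simply ranks its evaluated arrows by the distance of their evaluations from the selector. The following Python sketch is only an illustration of that rule; the concrete evaluations assigned to arrows a and c (100 and 2) are assumptions chosen so that the two runs reproduce the orders quoted above, since the actual constants of Fig. 4 are not reproduced here.

```python
def attempt_order(arrows, selector):
    """Arrows of an ORDER state are attempted by increasing |evaluation - selector|."""
    return [label for _, label in sorted(arrows, key=lambda a: abs(a[0] - selector))]

def arrows(S1):
    # hypothetical evaluated arrows: a = 100, b = the variable S1, c = 2
    return [(100, 'a'), (S1, 'b'), (2, 'c')]

print(attempt_order(arrows(S1=0), selector=0))       # ['b', 'c', 'a']
print(attempt_order(arrows(S1=3.14), selector=100))  # ['a', 'b', 'c']
```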
4.2 Control State RANGE and System Option RANGEORDER

There are two versions of control state RANGE – without and with an ELSE clause. They are shown in Fig. 5. There are two selectors – L and H, where L≤H – and only the arrows with heuristic values in the range between L and H are considered; the arrows outside the specified range are cut off. When there are no arrows within the range, the ELSE arrow is tried if there is one, otherwise backtracking is performed.
Fig. 5. The two versions of control state RANGE – graphical representation and specification in the CN
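The cut-off rule of a RANGE state can be paraphrased in a few lines of Python. This is a hedged sketch, not full SPIDER semantics: an arrow is modelled as an (evaluation, label) pair, the sample evaluations are invented, and the ordering of the surviving arrows is left to the system option RANGEORDER, discussed next.

```python
def range_state(arrows, low, high, else_arrow=None):
    """Keep only arrows whose evaluation lies in [low, high]; if none survive,
    fall back to the ELSE arrow when present, otherwise signal backtracking."""
    active = [a for a in arrows if low <= a[0] <= high]
    if active:
        return active                      # order among these: see RANGEORDER
    return [else_arrow] if else_arrow is not None else []   # [] => backtrack

arrows = [(15, 'B'), (70, 'C'), (-1, 'D')]               # assumed evaluations
print(range_state(arrows, low=0, high=40))               # [(15, 'B')]
print(range_state(arrows, low=200, high=300))            # [] -> backtracking
```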
The arrows falling within the range must be attempted in a certain order. This order is determined by the value of the system option RANGEORDER. If the value of RANGEORDER is NO, the arrows are chosen according to their order in the CN; if RANGEORDER=LOWFIRST, the arrows with smaller heuristic values come first; and if RANGEORDER=HIGHFIRST, the arrows with larger heuristic values come first. The value RANGEORDER=RANDOM implies a random choice among the arrows within the range.

4.3 System Option ORDEROFARROWS

While system option RANGEORDER is used with control states of type RANGE, system option ORDEROFARROWS is used with normal states. Its possible values NO and RANDOM have meanings identical to those of the same values of RANGEORDER.

4.4 Control State SELECT and System Option PROXIMITY

Control state SELECT is illustrated in the upper part of Fig. 6. Analogously to control state RANGE, there is a second version with an ELSE arrow. Here, an outgoing arrow is active (can be selected) only if its evaluation is equal to the value of the state selector. For the example, if Sel=3.14 and S1=3.14, then both arrows b and c will be active. The state selector and arrow evaluations are real numbers and their comparison is very sensitive. In the example considered, if S1=3.142857 then only arrow c will eventually be active. The system option PROXIMITY defines a tolerance value – the values of the selector and the evaluations are considered equal if the absolute value of their difference is not greater than the value of PROXIMITY. For example, in the lower part of Fig. 6, if Sel=3.14 and S1=3.142857 then both arrows b and c will be active. The system option PROXIMITY may also have the constant value NEAREST – in that case the outgoing arrow selected will be the one whose evaluation is nearest to the value of the selector.
Fig. 6. Control state SELECT without and with system option PROXIMITY – graphical representation and specification in the CN
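The activation rule of SELECT, with and without PROXIMITY, can be paraphrased as follows. The sketch assumes arrows are (evaluation, label) pairs; the evaluation 1.0 given to arrow a is an invented constant, while the values for b and c mirror the Sel/S1 examples above.

```python
def select_state(arrows, selector, proximity=0.0):
    """An arrow is active when its evaluation matches the selector within the
    PROXIMITY tolerance; the special value 'NEAREST' activates only the arrow
    whose evaluation is closest to the selector."""
    if proximity == 'NEAREST':
        return [min(arrows, key=lambda a: abs(a[0] - selector))]
    return [a for a in arrows if abs(a[0] - selector) <= proximity]

arrows = [(1.0, 'a'), (3.142857, 'b'), (3.14, 'c')]      # b carries the variable S1
print(select_state(arrows, selector=3.14))                       # only arrow c
print(select_state(arrows, selector=3.14, proximity=0.01))       # arrows b and c
print(select_state(arrows, selector=3.14, proximity='NEAREST'))  # arrow c
```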
5 Conclusion

The means for dynamic control of the computation process in Spider have been discussed. These include system options for dynamic control and control states. The most prominent usage of dynamic search control is for automatic, "non-procedural" implementation of various heuristic algorithms for informed search or of improved uninformed search algorithms. Report [5], presented at the same conference, is devoted to the usage of dynamic control for non-procedural implementation of heuristic algorithms, as well as to related ideas for automation of programming.
References

1. Kratchanov, K., Golemanov, T., Golemanova, E.: Control Network Programming. In: 6th IEEE/ACIS Conf. on Computer and Information Science (ICIS 2007), Melbourne, Australia, pp. 1012–1018. IEEE Computer Society Press, Los Alamitos (2007)
2. Kratchanov, K., Golemanova, E., Golemanov, T.: Control Network Programming Illustrated: Solving Problems With Inherent Graph-Like Structure. In: 7th IEEE/ACIS Conf. on Computer and Information Science (ICIS 2008), Portland, Oregon, USA, pp. 453–459. IEEE Computer Society Press, Los Alamitos (2008)
3. Kratchanov, K., Golemanova, E., Golemanov, T.: Control Network Programs and Their Execution. In: 8th WSEAS Int. Conf. on Artificial Intelligence, Knowledge Engineering and Data Bases (AIKED 2009), Cambridge, UK, pp. 417–422. WSEAS Press (2009)
4. Kratchanov, K., Golemanov, T., Golemanova, E.: Control Network Programs: Static Search Control with System Options. In: 8th WSEAS Int. Conf. on Artificial Intelligence, Knowledge Engineering and Data Bases (AIKED 2009), Cambridge, UK, pp. 423–428. WSEAS Press (2009)
5. Kratchanov, K., Golemanova, E., Golemanov, T., Ercan, T.: Nonprocedural Implementation of Local Heuristic Search in Control Network Programming. In: 14th Int. Conf. on Knowledge-Based and Intelligent Information & Engineering Systems (KES 2010), Cardiff, UK (2010)
6. Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach, 3rd edn. Prentice Hall, Upper Saddle River (2010)
7. Luger, G.F.: Artificial Intelligence: Structures and Strategies for Complex Problem Solving, 6th edn. Addison Wesley, Boston (2009)
8. Jones, M.T.: Artificial Intelligence: A Systems Approach. Infinity Science Press, Higham (2008)
9. Shinghal, R.: Formal Concepts in Artificial Intelligence. Fundamentals. Chapman & Hall, London (1992)
10. Golemanov, T.: An Integrated Environment for Control Network Programming. Research report, Rousse University (1996)
11. Golemanov, T., Kratchanov, K., Golemanova, E.: Spider vs. Prolog: Simulating Prolog in Spider. In: 10th Int. Conf. on Computer Systems and Technologies (CompSysTech 2009), Rousse, Bulgaria, ACM Bulgaria, Sofia. ACM International Conference Proceeding Series, vol. 433, pp. II.9-1–II.9-7. ACM, New York (2009)
12. Golemanova, E., Kratchanov, K., Golemanov, T.: Spider vs. Prolog: Computation Control. In: 10th Int. Conf. on Computer Systems and Technologies (CompSysTech 2009), Rousse, Bulgaria, ACM Bulgaria, Sofia. ACM International Conference Proceeding Series, vol. 433, pp. II.10-1–II.10-6. ACM, New York (2009)
13. Kratchanov, K., Golemanov, T., Golemanova, E., Stanev, I.: Control Network Programming in SPIDER: Built-in Search Control Tools. In: Ciftcibasi, T., Karaman, M., Atalay, V. (eds.) New Trends in Artificial Intelligence & Neural Networks, pp. 105–109. EMO Scientific Books, Ankara (1997)
14. Golemanova, E., Golemanov, T., Kratchanov, K.: Built-in Features of the SPIDER Language for Implementing Heuristic Algorithms. In: Conf. on Computer Systems and Technologies (CompSysTech 2000), Sofia, pp. 2091–2095. ACM Press, New York (2000)
15. Stanev, I.: Formal Programming Language Net. Part I – Conception of the Language. In: Conf. on Computer Systems and Technologies (CompSysTech 2001), Sofia (2001)
16. Stanev, I.: Formal Programming Language Net. Part II – Syntax Diagrams. In: Conf. on Computer Systems and Technologies (CompSysTech 2001), Sofia (2001)
17. Winston, P.: Artificial Intelligence, 3rd edn. Addison Wesley, Reading (1992)
18. Lee, R.C.T., Tseng, S.S., Chang, R.C., Tsai, Y.T.: Introduction to the Design and Analysis of Algorithms. McGraw-Hill, Singapore (2005)
19. Levitin, A.: Introduction to the Design and Analysis of Algorithms, 2nd edn. Addison Wesley, Boston (2007)
Non-procedural Implementation of Local Heuristic Search in Control Network Programming Kostadin Kratchanov1, Emilia Golemanova2, Tzanko Golemanov2, and Tuncay Ercan1 1 Department of Software Engineering, Yasar University, Universite Cad. Yagacli Yol No.35-37, Bornova/Izmir, 35100 Turkey {kostadin.kratchanov,tuncay.ercan}@yasar.edu.tr 2 Department of Computing, Rousse University, 8 Studentska Street, Rousse, Bulgaria {EGolemanova,TGolemanov}@ecs.ru.acad.bg
Abstract. The report describes the type of improved uninformed or heuristic search algorithms that are well-suited for non-procedural implementation in Control network programming, and how this can be achieved using the tools for dynamic computation control. Keywords: Control network programming, heuristic algorithms, programming paradigms, programming languages, nonprocedural programming, local search, automated programming.
1 Introduction

Control network programming (CNP) and, more specifically, the Spider language support powerful means for dynamic control of the computation/search process. These means – dynamic control system options and control states – were introduced in [1]. This report is devoted to describing the usage of dynamic control for the non-procedural implementation of various heuristic strategies. CNP is especially effective for the non-procedural implementation of a certain type of heuristic algorithms. We refer to this type of algorithms as local search. We address the question of how to recognize this type of heuristic approach. We also discuss the issue of how to model other types of heuristic algorithms. Although such "non-local" algorithms need a procedural implementation, it turns out that in most cases we can actually use a generic, universal control network (CN) and only adjust the primitives used in this CN accordingly.
2 Variants of Hill Climbing

We start with a discussion of a number of CNP implementations of the well-known Hill Climbing heuristic strategy [2-6], using the example of the map traversal problem as defined in [1]. We assume that the map given is the one shown in Fig. 3 in [1]. As usual for similar problems, the heuristic used is the air distance (also called straight-line distance) between the cities (e.g., [2]). Table 1 specifies the air distances.
2.1 Using Control State ORDER – Movements Downhill Allowed

An implementation for finding a path in the map is shown in Fig. 1. An ORDER state is used for every city in the map with multiple outgoing arrows – in our example these are cities A, B and C. Primitive SetAirDist sets the heuristic values of the arrows according to Table 1. For example, the primitive call SetAirDist(‘B’,A1) associated with the arrow entering control state OrderA finds in Table 1 the air distance from city B to the final city and assigns this value to variable A1. The value of A1 is actually the heuristic value of the arrow from OrderA to B, in other words, the heuristic value of city B in the classical hill-climbing algorithm. The selectors of all ORDER control states have the value 0. Therefore, when the control is in a given control state (e.g., OrderA) the emanating arrows will be attempted in the order of the air distances from the corresponding cities, a city with a smaller air distance being chosen first. For state OrderA, if the goal city is D then the values of A1 and A2 will be 15 and 70, respectively, and the control will first go to node B, and then to node C if backtracking into OrderA has happened.
Fig. 1. Non-recursive Hill-Climbing solution with usage of control state ORDER
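Read procedurally, the CN of Fig. 1 makes the interpreter behave roughly like the following Python sketch: children are attempted in increasing air distance, and backtracking is retained. The road map and most air distances are assumed stand-ins for Fig. 3 and Table 1 of [1] (only the values 15 and 70 for B and C are taken from the text above).

```python
ROADS = {'A': ['B', 'C'], 'B': ['D', 'C'], 'C': ['D'], 'D': []}   # assumed map
AIR = {'A': 90, 'B': 15, 'C': 70, 'D': 0}     # air distance to goal D (A's value assumed)

def hill_climb_bt(city, goal, path=None):
    """Steepest-ascent hill climbing with backtracking; downhill moves allowed."""
    path = path or [city]
    if city == goal:
        return path
    # ORDER state with selector 0: try children by increasing air distance.
    for child in sorted(ROADS[city], key=AIR.get):
        if child not in path:
            result = hill_climb_bt(child, goal, path + [child])
            if result:                       # on failure, backtrack to the next child
                return result
    return None

print(hill_climb_bt('A', 'D'))               # e.g. ['A', 'B', 'D']
```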
Table 1. Air distances between cities in the map
This algorithm corresponds to steepest-ascent hill climbing with backtracking [4] in which moving downhill is allowed. It also corresponds to the hill climbing strategies described in [5,6]. Note that this strategy is complete, in contrast to most of the other approaches described below.

2.2 Using Control State RANGE – Movements Downhill Not Allowed

More often, the hill climbing algorithm includes an additional restriction that the heuristic value of the next state cannot be worse than the heuristic value of the current state [4].
Fig. 2. Non-recursive Hill-Climbing solution using control state RANGE – variants 1 and 2
This algorithm can be simulated in CNP using control states RANGE. For our example, the CN will be similar to the one shown in Fig. 1. The segment of the modified CN that corresponds to state A is shown in the left part of Fig. 2. The primitive call SetAirDist(‘A’,A) is added to the arrow entering node RangeA. The second parameter is an output parameter. The value of variable A is calculated dynamically – note that variable A is actually the higher selector determining the range for the control state RangeA. Option [RANGEORDER=LOWFIRST] must be used
for each control state of type RANGE in subnet Map – as a result, the city with the smallest air distance among the possible neighbors of the current city will be attempted, provided the air distance of the neighbor is smaller than the air distance of the current city. A second variant is illustrated in the right part of Fig. 2. Primitive SetAirDist has an additional first parameter. Basically, this primitive acts in the same manner as before – SetAirDist(‘A’,’B’,A1) sets the value of variable A1 to the air distance from city B to the final city, as specified in Table 1. However, if the air distance from B is larger than the air distance from A, then the value of A1 will be set to -1. Negative values are outside the range specified for RangeA, and therefore the transition from RangeA to B will not be possible (this branch will be cut off).

2.3 A Recursive Solution with Usage of Control State RANGE

Fig. 3 illustrates another, recursive implementation of the hill-climbing strategy for the map traversal problem. Its idea is similar to the non-recursive solution shown on the left of Fig. 2. Primitive Connection(City, NextCity, AirDist) checks if there is a road from City to NextCity. If such a connection exists, it sets the air distance from NextCity to the final city according to Table 1. In case no link exists, the heuristic value is set to -1. This value is outside the range and therefore such arrows will be pruned.
Fig. 3. Recursive Hill-Climbing solution with usage of control state RANGE
Recursive solutions are typically preferable when the map (or the state space) is too large.
3 System Option NUMBEROFARROWS

The system options for static search control in CNP were described in [7]. Most system options for dynamic control were discussed in [1]. The last such option, NUMBEROFARROWS, is introduced here. The value of the system option NUMBEROFARROWS determines the maximum number of outgoing arrows which will be tried by the interpreter for a particular state in the option's scope.

3.1 Irrevocable Hill Climbing

If we modify the CN from Fig. 1, adding [NUMBEROFARROWS=1] for each of the three control states, then only a single arrow will be attempted. That means that the CNP solution will actually model the well-known fast algorithm of irrevocable hill climbing [5]. Most hill-climbing algorithms in the literature are actually irrevocable hill climbing [2,3]. Note that the same effect (irrevocable hill climbing) will be achieved if, instead of [NUMBEROFARROWS=1], we use [BACKTRACKING=NO]. The Nearest-Neighbor algorithm for the Traveling Salesperson problem [3,8,15] is actually a specific implementation of irrevocable hill climbing. It can be easily modeled in CNP.

3.2 Hill Climbing with Limited Revocability

If, instead of [NUMBEROFARROWS=1], we use [NUMBEROFARROWS=b] for a given constant b, then the resulting algorithm will be similar (but not identical) to beam search, also called pruned breadth-first search [2,5,6]. Similarly, based on the hill-climbing solutions of Sections 2.2 and 2.3 instead of the solution from Section 2.1, we can simulate irrevocable hill climbing using [NUMBEROFARROWS=1] or [BACKTRACKING=NO], and hill climbing with restricted revocability using [NUMBEROFARROWS=b].
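Procedurally, NUMBEROFARROWS is just a cap on how many of the ordered children may be tried before a state fails. The sketch below (same assumed ROADS and AIR tables as in the earlier hill-climbing fragment) degenerates to irrevocable hill climbing for max_arrows=1 and gives the limited-revocability variant for a constant b.

```python
ROADS = {'A': ['B', 'C'], 'B': ['D', 'C'], 'C': ['D'], 'D': []}   # assumed map
AIR = {'A': 90, 'B': 15, 'C': 70, 'D': 0}                         # assumed heuristic

def hill_climb_capped(city, goal, max_arrows, path=None):
    """Hill climbing in which at most max_arrows children of each state are tried;
    max_arrows=1 corresponds to [NUMBEROFARROWS=1], i.e. irrevocable hill climbing."""
    path = path or [city]
    if city == goal:
        return path
    children = sorted(ROADS[city], key=AIR.get)[:max_arrows]      # cap the attempts
    for child in children:
        if child not in path:
            result = hill_climb_capped(child, goal, max_arrows, path + [child])
            if result:
                return result
    return None          # with a small cap, failure (and backtracking) comes sooner

print(hill_climb_capped('A', 'D', max_arrows=1))
```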
4 Introducing Random or Stochastic Choice
It is well known that local greedy algorithms (such as hill climbing) can get stuck in a local optimum. Therefore, various variants have been invented that expand the search space in an effort to avoid becoming trapped in a local optimum [2,3]. This is achieved by incorporating some form of randomness in the choice of the successor of a state, which in the CNP case means in the choice of the outgoing arrow to follow. Such modifications are stochastic hill climbing (a next uphill move is chosen randomly), first-choice hill climbing (successors are generated stochastically and the first of them which is better than the current state is chosen), and simulated annealing (a random move is picked; if it is "better" the move is accepted, otherwise the algorithm accepts the move with some probability depending on the steepness of the descent and the "temperature" – bad moves are more likely to be allowed at the start when the temperature is high). For discussions of the importance of introducing randomness in
robot control algorithms and in computation theory generally, the interested reader is referred to [9,10]. The CNP implementation of some of these algorithms is described below.

4.1 Stochastic Hill Climbing

Three variants are suggested. They can be illustrated by the diagrams shown in Fig. 2 and Fig. 3. The only difference is that the following system option values must be specified: [RANGEORDER=RANDOM] and [NUMBEROFARROWS=1]. The first option sets a random order of the active outgoing arrows. This selection is irrevocable due to [NUMBEROFARROWS=1].

4.2 First-Choice Hill Climbing

This algorithm implements stochastic hill climbing by generating successors randomly until one is generated that is better than the current state. A CNP solution is shown in Fig. 4. Arrows emanating from state FromA will be selected randomly due to [ORDEROFARROWS=RANDOM], and the first one leading to a city with an air distance better than the air distance of the current city will be successful. Primitive IsBetter(Child,Parent) calculates the air distances of Child and Parent and succeeds if Child is better than Parent.
Fig. 4. First-Choice Hill Climbing – declarative solution
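A procedural paraphrase of the behavior just described, over the same assumed ROADS/AIR data rather than the actual CN of Fig. 4: successors are drawn in random order and the first one strictly better than the current city is accepted.

```python
import random

ROADS = {'A': ['B', 'C'], 'B': ['D', 'C'], 'C': ['D'], 'D': []}   # assumed map
AIR = {'A': 90, 'B': 15, 'C': 70, 'D': 0}                         # assumed heuristic

def first_choice(city, goal):
    """First-choice hill climbing: successors in random order, accept the first
    one whose air distance is better than that of the current city."""
    path = [city]
    while city != goal:
        children = ROADS[city][:]
        random.shuffle(children)             # [ORDEROFARROWS=RANDOM]
        for child in children:
            if AIR[child] < AIR[city]:       # primitive IsBetter(Child, Parent)
                city = child
                path.append(city)
                break
        else:
            return None                      # no better successor: stuck
    return path

print(first_choice('A', 'D'))
```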
The recursive solution can similarly be modified to emulate first-choice hill climbing.

4.3 Simulated Annealing

Finally, without much discussion, we will only include here (Fig. 5) the CN for a possible CNP solution to the Traveling Salesperson Problem (for cities A, B, C and D) that implements the Simulated Annealing algorithm. The pseudocode of primitive Accept is very similar to the pseudocode of the basic process in [11]. The effect of
primitive Neighbor and option [RANGEORDER=RANDOM] is to randomly swap a pair of points. The values of NTemps (the number of temperature steps to try) and NLimit (the number of trials at each temperature) in our solution must be one less than the respective values in [11], as the value of the system option LOOPS limits the maximum number of repeated visits of a state.
Fig. 5. Simulated annealing for TSP
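For comparison with the CN of Fig. 5, here is a compact procedural sketch of the same annealing scheme in Python (random pair swap as the Neighbor move, Metropolis-style acceptance as primitive Accept). The distance table for the four cities and the temperature schedule are assumptions made only for this illustration; they are not taken from [11].

```python
import math, random

DIST = {('A', 'B'): 20, ('A', 'C'): 40, ('A', 'D'): 35,   # assumed symmetric distances
        ('B', 'C'): 30, ('B', 'D'): 25, ('C', 'D'): 15}

def tour_len(tour):
    return sum(DIST[tuple(sorted((tour[i], tour[(i + 1) % len(tour)])))]
               for i in range(len(tour)))

def accept(delta, temp):
    """Primitive Accept: always take improvements, take worsening moves with a
    probability that shrinks as the temperature drops."""
    return delta <= 0 or random.random() < math.exp(-delta / temp)

def anneal(cities, n_temps=20, n_limit=50, temp=100.0, cooling=0.8):
    tour = list(cities)
    for _ in range(n_temps):               # NTemps temperature steps
        for _ in range(n_limit):           # NLimit trials at each temperature
            i, j = random.sample(range(len(tour)), 2)
            cand = tour[:]
            cand[i], cand[j] = cand[j], cand[i]      # Neighbor: swap a random pair
            if accept(tour_len(cand) - tour_len(tour), temp):
                tour = cand
        temp *= cooling
    return tour, tour_len(tour)

print(anneal(['A', 'B', 'C', 'D']))
```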
5 Non-procedural Local Search in CNP

Traditional search algorithms for both exhaustive and informed search [2-6] typically require a "procedural" implementation, that is, a procedure employing loops. In many cases the usage of data structures usually called OPEN and CLOSED is fundamental in these algorithms, where OPEN is normally a stack, a queue or a priority queue depending on the type of the search strategy. Instead of iterative algorithms, recursive ones are sometimes used (e.g., [3]); this, however, does not change their "procedural" nature. Even the implementation of search algorithms in Prolog remains procedural (with few exceptions) – the Prolog program actually models the imperative algorithms [12]. CNP, respectively Spider, has a very interesting feature: a large group of search algorithms can be specified "non-procedurally". By this we mean that the algorithm is represented by a CN which generally is a description of the problem, and not a simulation of an iterative procedure. Trivial examples would be the algorithms for animal
identification and arithmetic expression recognition from [13], where the search performed is simply the extended backtracking search built into the Spider interpreter. All other algorithms discussed in [1] and in this paper, with the exception of A*, may be assigned to this group as well. (A mixed case is Simulated annealing, which has a considerable procedural element.) The CNs of these CNP implementations are not simulations of procedural algorithms, and no additional data structures such as OPEN are used. Instead, the CNs are conceptually declarative descriptions of the corresponding problem; the usage of appropriate system options and control states ensures that the search performed on the CN by the standard Spider interpreter will actually correspond to the heuristic or improved uninformed algorithm in question, for example hill climbing or branch-and-bound. What actually happens is that the CNP programmer uses the tools for dynamic control to achieve, without procedurally specifying any algorithm, a behavior of the standard interpreter that corresponds to the desired heuristic or other more advanced search strategy. Not all heuristic or advanced uninformed search algorithms are suited for such non-procedural implementation in CNP. An algorithm that may be directly implemented should be based on the extended backtracking strategy built into the interpreter. We can easily model different modes of choice of the candidate outgoing arrow (such as best-first, or randomly, or within a range, or a restricted number of arrows, etc.), or modify other parameters of the backtracking (e.g., forbid backtracking). We can perform such modifications even dynamically. Basically, we can do anything we want with the arrows of the current state; we can backtrack, but then we will again have a current state. Therefore, we call this group of search strategies "local search". This phrase is related to but is not identical with the phrase "local search" as usually understood in the literature (e.g., [2,3,15]), where the emphasis is different. Note also that our local search is usually revocable. We have illustrated the implementation of a number of local search algorithms in [1] and in this paper, such as optimal search with cutting off insipid paths (branch-and-bound), hill climbing, irrevocable hill climbing, nearest neighbor search, a version of beam search, stochastic hill climbing, and first-choice hill climbing.
6 Some Search Strategies Must Be Implemented Procedurally

It must be understood that not all search algorithms are well suited for an elegant non-procedural implementation in SPIDER. Most importantly, we have no automatic means to implement strategies making use of a global data structure OPEN, where OPEN can include candidate arrows emanating from different states. On the other hand, CNP is a universal programming paradigm. Therefore, all heuristic and other algorithms can, in principle, be modeled. This can actually be done quite easily, typically using data structures such as OPEN, similarly to classical implementations in imperative, logic or functional programming. Such CNP realizations of heuristic algorithms have also been done, for instance numerous versions of the A* algorithm (e.g., [16]). Incidentally, one such implementation is the CN program shown in Fig. 1 of [1].
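To make the contrast concrete, this is what an OPEN-driven strategy looks like when written out procedurally – a generic greedy best-first loop in Python, not SPIDER code and not the A* CN of [16]; the successor function and the heuristic are left to the caller.

```python
import heapq

def best_first(start, goal, successors, h):
    """Greedy best-first search driven by a global OPEN priority queue;
    CLOSED records the states already expanded."""
    open_list = [(h(start), start, [start])]          # OPEN, ordered by h
    closed = set()
    while open_list:
        _, state, path = heapq.heappop(open_list)     # best candidate overall,
        if state == goal:                             # possibly from another branch
            return path
        if state in closed:
            continue
        closed.add(state)
        for nxt in successors(state):
            if nxt not in closed:
                heapq.heappush(open_list, (h(nxt), nxt, path + [nxt]))
    return None

# e.g. best_first('A', 'D', successors=lambda c: ROADS[c], h=AIR.get)
```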
These "procedural" implementations fall outside the subject of this report. We would only like to mention that such CNs are actually "generic" – they depend on the strategy to be implemented but normally not on the specific problem being solved. While the CN remains unchanged, only the primitives used must be adjusted to the particular problem. At a second level of generalization, the CN will even remain identical for a number of search strategies. A somewhat similar approach is taken in well-designed object-oriented implementations (e.g., [12]) based on the usage of abstract classes, interfaces and similar more abstract concepts.
7 Conclusion

The most prominent usage of the tools for dynamic search control in CNP is for automatic, "non-procedural" implementation of various heuristic algorithms for informed search or improved uninformed search – no search strategy has to be programmed; the aimed-at search strategy results automatically from the work of the standard interpreter. The questions of which search algorithms are well suited for such an elegant declarative CNP implementation and of how to specify (program) such a strategy have been targeted in this report. A wider view of the approaches to implementing various search strategies in CNP is the subject of [17]. This research has a two-fold relationship with automated programming and automated software engineering. Firstly, CNP is a natural choice as an implementation tool when considering many problems involving automated programming. A related early application is [14]. Secondly, a natural next step would be research towards automating CN programming itself or, stated differently, moving to a higher level in CNP programming. Instead of using system options and control states to specify a heuristic algorithm, a challenging but worthy task would be to consider the possibility of, for example, simply setting the value of a higher-level option STRATEGY to HILLCLIMBING, asking the interpreter itself to implement a hill-climbing strategy for solving the problem specified by the CN. Developing methods for CNP software engineering and automated CNP engineering is also among the possible areas for future research.
References

1. Kratchanov, K., Golemanov, T., Golemanova, E., Ercan, T.: Control Network Programming with SPIDER: Dynamic Search Control. In: 14th Int. Conf. on Knowledge-Based and Intelligent Information & Engineering Systems (KES 2010), Cardiff, UK (2010)
2. Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach, 3rd edn. Prentice Hall, Upper Saddle River (2010)
3. Luger, G.F.: Artificial Intelligence: Structures and Strategies for Complex Problem Solving, 6th edn. Addison Wesley, Boston (2009)
4. Thornton, C., du Boulay, B.: Artificial Intelligence: Strategies, Applications, and Models Through Search, 2nd edn. Amacom, New York (1998)
5. Shinghal, R.: Formal Concepts in Artificial Intelligence. Fundamentals. Chapman & Hall, London (1992)
6. Winston, P.H.: Artificial Intelligence, 3rd edn. Addison Wesley, Reading (1992)
7. Kratchanov, K., Golemanov, T., Golemanova, E.: Control Network Programs: Static Search Control with System Options. In: 8th WSEAS Int. Conf. on Artificial Intelligence, Knowledge Engineering and Data Bases (AIKED 2009), Cambridge, UK, pp. 423–428. WSEAS Press (2009)
8. http://en.wikipedia.org/wiki/Nearest_neighbour_algorithm
9. Martin, F.G.: Robotic Explorations. Prentice Hall, Upper Saddle River (2001)
10. Hromkovic, J.: Theoretical Computer Science: Introduction to Automata, Computability, Complexity, Algorithmics, Randomization, Communication, and Cryptography. Springer, Berlin (2004)
11. http://home.att.net/~srschmitt/sa_demo/SA-demo_1.html
12. Luger, G.F., Stubblefield, W.A.: AI Algorithms, Data Structures, and Idioms in Prolog, Lisp, and Java. Addison Wesley, Boston (2008)
13. Kratchanov, K., Golemanova, E., Golemanov, T.: Control Network Programming Illustrated: Solving Problems With Inherent Graph-Like Structure. In: 7th IEEE/ACIS Conf. on Computer and Information Science (ICIS 2008), Portland, Oregon, USA, pp. 453–459. IEEE Computer Society Press, Los Alamitos (2008)
14. Kratchanov, K.D., Stanev, I.N.: A Rule-Based System for Fuzzy Natural Language Robot Control. In: Plander, I. (ed.) Proc. 4th Intern. Conf. Artificial Intelligence and Information-Control Systems of Robots, Smolenice, Czechoslovakia, pp. 304–309. North-Holland, Amsterdam (1987)
15. Levitin, A.: Introduction to the Design and Analysis of Algorithms, 2nd edn. Addison Wesley, Boston (2007)
16. Golemanov, T.: An Integrated Environment for Control Network Programming. Research report, Rousse University (1996)
17. Kratchanov, K., Golemanova, E., Golemanov, T., Ercan, T., Ekici, B.: Procedural and Non-Procedural Implementation of Search Strategies in Control Network Programming. To be presented at: Intl Symposium on Innovations in Intelligent Systems and Applications (INISTA 2010), Kayseri & Cappadocia, Turkey (2010)
Meta Agents, Ontologies and Search, a Proposed Synthesis Ronald L. Hartung1 and Anne Håkansson2 1
Department of Computer Science Franklin University Columbus, Ohio, USA 2 Department of Communication Systems KTH, The Royal Institute of Technology Stockholm, Sweden
Abstract. The semantic web dates back to 2001, and a number of tools and representational forms have been proposed and defined. However, in the opinion of the authors, the challenges of the semantic web are still largely unfulfilled. What is proposed here is a synthesis of ideas and techniques aimed at presenting the user with a much more structured search result that can expose the relationships contained within it. This does not increase the semantics of the search itself, but it can provide a much richer view for the user to apply human understanding, by exposing the semantics represented by the structure within search results. Keywords: Agents, multi-agent, ontology, search, semantic web.
1 Introduction

Ontology was described as a key component in the description of the semantic web, web 2.0 and even web 3.0 [3, 2, 12]. It is reasonable to conclude that, whatever degree of success these three interlinked initiatives have achieved, the original vision does not seem to have been realized. This paper seeks to look at some issues around the application of ontology and to propose an alternative approach. The approach will support parallelism in the search by the use of multi-agents and stretch the application of ontology as a component of search. The multi-agent system approach enables the parallelism, but also allows the logic to be simplified. The search result will be structured, and not a linear list of results. This approach has evolved over several years as a result of joint research efforts combined with introspection about what worked well and where the gaps are found. The goal of this search technique is to return results that expose relationships within the results. This structure contains relationships that can be exposed via metadata on the source pages. It can also be extended by recourse to ontologies in several ways. While this does not significantly extend the semantics of the search, it can help the user find relationships in the results that the user can explore, possibly recognizing useful information that may otherwise have gone unnoticed.
2 Related Work

It has been suggested that applying multiple search engines and then analyzing the results for the best descriptions of the contents is a reasonable way to extend search [1, 17]. This idea is interesting and can be further extended in some other directions. Another proposal for improving search is through the use of ontology. The classical use of ontology in search is to expand the user's search terms. Taking the concepts in the query terms and using ontology content to add additional terms to the query results in an expanded query [13, 14]. This is sometimes done using similarity computations based on ontology structure [18]. A very interesting approach is found in the work of Stojanovic [15], where the query is analyzed for ambiguity; the redundant queries are analyzed by using the relationships in the ontology. For example, if the query contains terms that have a subset relationship between them, then one term can be eliminated without damage to the query results.
3 Problem Areas

There are several problem areas that form the basis for the approach proposed in this paper. These are based on observations in prior research efforts coupled with comments of other researchers. The assumptions that underlie the approach are explored in this section. Philosophy explored ontology for many years, even though its aim was different from the applications in AI and computer science. Ontology was to be the description of all that exists in the real world. Furthermore, it was to be a hierarchically structured classification taxonomy. Unfortunately, the philosophers have never agreed on even the top level of a universal ontology. The closest computer science has come to philosophy's goal for ontology can be found in the CYC [4] system. This is a research project gone commercial, so the best reference is the CYCcorp web site. It is instructive to observe that CYC's creators chose to accept a lack of global consistency in favor of locally enforced consistency. This clearly adds flexibility for the ontology designer, but requires additional care when reasoning systems use the results of the ontology. It would be easy to have contradictions added to the assumptions used by the reasoner. A seminal issue confronting any approach to ontology is the lack of a simple definition. The problem is that objects in the world exhibit a wondrous and perplexing property: while humans try to define objects, the world resists our definitions. Take the common object chair; generally we define that as something used to sit upon. But chairs take so many forms that it is very hard to rigorously define a chair. Worse, a chair can be defined as something being sat upon, which includes objects other than things strictly chair-like, or non-chair objects used as a chair. And conversely a chair can be used for many other purposes. This example is a common one for philosophers to play with, but it illustrates a more interesting problem for search: strangely disconnected and diverse pieces of information can come together to form relevant search results. Added to these problems is the everlasting conundrum of symbolic AI, the symbol-grounding problem. No solution to symbol grounding has yet been found satisfactory.
Adding ontologies into the mix does not really do anything to solve symbol grounding. If anything, ontologies could give a false sense of security that somehow our systems are "understanding", thus avoiding or hiding the symbol-grounding issue. Ontology mapping and alignment have been a big effort for some years [10, 16], with the hope being that greater interoperability could be obtained by providing a mapping between disparate ontologies. While this can and does work, it is still a primitive and costly procedure. In addition, it often requires user input to obtain high-quality mappings. In the proposal herein, the choice is not to use alignment or mapping of the ontologies, but rather to leave it to the user to interpret the search results through their own understanding. While this is really the case in search results now, the approach described herein is to structure the data so that the multiple ontologies represented in the results remain visible. This explicitly avoids mixing the results from separate ontologies. The semantic web encourages the construction of multiple ontologies. These are often designed to be domain ontologies, meaning an ontology that captures a view of a set of objects in a domain. This is clearly easier to construct than a single correct and global ontology. However, there are no standards for their construction, beyond the use of ontology languages like OWL or RDF. Worse, ontologies are written for specific purposes. An example of this is the friend of a friend, FOAF. This is an RDF-based description of people. This is not a problem in itself, since the terms use a "foaf:" prefix or can be detected by inclusion of the FOAF definitions. Other ontologies may not have the same purpose or a domain specified. Certainly, the quality issue will be there, given the difficulty of constructing ontologies. Whatever solution path is chosen, ontology quality will be a confounding factor. Our approach will allow the user to know what ontology sources are used and to evaluate the results. One approach that has been proposed is to allow the user to choose the domain ontology. When the user is searching for information, the user may have a concept of the domain in which the desired information resides. But this can place too much dependence on the user. Even if the user has a strong idea of the domain, there is no assurance that the information is stored within the domain the user believes it to be. Many of the existing search techniques assume a single ontology. Given the issues with ontology design and the lack of a single ontology, such a search technique will work with varying degrees of success, depending on how well the domain of the ontology matches the domain of the query. A second issue, somewhat related to ontology, is RDF metadata. As part of the web 2.0 effort, metadata is supposed to be attached to web pages. RDF is a formal mechanism for metadata, defined by the W3C organization. However, observations indicate that RDF is not that widespread on the net. Moreover, even where RDF is present, it may or may not be useful. The user will still be stuck with the problem of validating the data. Within a domain, and given a community of use within the domain, a solid well-defined RDF has a chance to come into existence. This is neither a guarantee, nor is it likely to be useful to those outside the community when searching results from these pages.
4 Proposed Solution

In our research, we employ multi-agent technologies. These systems [6, 7, 8] use agents for the lowest level tasks and meta-agents that organize the agents into a
computation fabric. These systems are employed for a number of problems. The fundamental organizing principle is to use meta-agents to form a hierarchy over the agents. The agents are the leaves of the hierarchy and are the lowest level of the work; in this case, the leaves will be the search elements. The meta-agents are used to organize the search elements. In this paper, we apply a multi-agent system, with agents and meta-agents, to perform semantic search with ontologies. Search is used as an example task in this paper; however, the same design can be extended into other forms of reasoning-based systems. The solution combines several aspects described in the following subsections.

4.1 Multiple Starting Ontologies

Using multiple base ontologies is one technique that can be used to counter the lack of a single usable ontology. For searches that are constrained to a single search domain, a useful single ontology may be a workable solution. But this does not meet the need for a general search tool. Likewise, an ontology-savvy user can select a well-chosen ontology to ground the search. However, this assumption seems problematic and unlikely to be a workable approach. The approach favored here is to use multiple ontologies. At this time, a fixed set is used. A longer-term solution is to build a reasoning system to select a likely set of ontologies. Accepting the idea that the system will not be able to apply one single ontology, multiple ontologies must instead be employed. The relationships between these ontologies and the pages being referenced will be dynamic and responsive to the problem or search under consideration. The extension here is to relate multiple aspects of the solution and structure the solution space according to these aspects. The search system has a predefined set of ontologies. A future extension will be to consider adapting the set of starting ontologies based on the apparent domain of the query. For each starting ontology, the user-supplied query terms are expanded to include synonyms from the ontology. This uses a standard expansion approach similar to [13, 14]. The expanded search of each ontology is used to start a separate parallel search, which builds the top-level meta-agents of the hierarchy.

4.2 Parallelization via Agents

The set of starting meta-agents spawns multiple base-level agents. The base-level agents form a distributed search mechanism that locates a set of web pages. The agent stores the location of pages via a URI [9]. In addition to locating the pages, the agent will extract and normalize metadata from the page. This metadata is used to organize the pages under the meta-agents. The number of pages to be held in a single agent is parameterized. There is a trade-off to be made between the number of agents to use and the amount of storage required for a set of pages in a single agent. This is under study.

4.3 Search Result Structure

A major departure from most ontology-based search approaches is to provide a richer structure to the search result. It is one thing to locate a set of results; it is another to expand the linear list of results into a complex conceptual structure. The goal
in our system is to provide a semantically rich result. A considerable part of the approach to ontology-based search is to improve recall and precision of the search results. This is a partial goal, since our goal extends to providing a conceptual structure of the results to help the user interpret what the search engine has returned. The search results will be structured in a multi-hierarchy. This follows the work in recent research [9]; the cluster approach of that work is used in this paper. The multi-hierarchy is constructed with multiple hierarchies over a set of leaf nodes. The multi-hierarchy is the fusion of multiple hierarchies. This takes the form of a hierarchy with crosslinks, both crosslinks joining nodes within a level of the hierarchy and crosslinks that join nodes in different levels. This results in a structure that mixes, in fact entangles, multiple hierarchies. In the application used here, the leaf nodes are shared between the multiple hierarchies. It is also possible that hierarchies may share nodes at higher levels. In either case, it is important to preserve the separateness of the hierarchies. The primary hierarchy formation mechanism is clustering of the leaves using metadata that describes the pages. The metadata is extracted from the web pages located by the searching agents. The clusters are formed by collecting sets of pages which share common metadata. The strength of a cluster is defined by the quantity of common metadata. Weaker clusters have fewer common items and are organized at a lower level in the hierarchy. The clusters are contained within meta-agents, i.e., each cluster is collected by a meta-agent. The meta-agent is an active part of the cluster formation. The cluster algorithm is contained in the meta-agent [9]. As pages are located by the search engines in the leaf-node agents, the metadata from the pages is broadcast to the meta-agents above the agents and the clustering process is applied. The algorithm has two parts. The first step determines whether the page fits a current cluster. The second step determines if a new cluster should be formed because there is a set of pages with common elements not covered by existing clusters. The resulting cluster collection is a multi-hierarchy, because the higher-level clusters are not composed as a collection of lower clusters. That is, a strict single hierarchy is not maintained. These meta-agents provide a parallel computation to form the clusters.

4.4 Clustering through Normalization

Normalization of the metadata on web pages is required to make this approach work. If a large subset of web pages found on the web applied RDF as a uniform or at least universal metadata approach, this would be less of a problem. However, we cannot assume RDF metadata will be found on any given page. The solution is a tool called WebCat [11]. WebCat processes web content, both pages and other known documents, to produce RDF metadata. There are two advantages to this solution. First of all, WebCat will produce usable metadata for web pages. WebCat will also process other document types, for example, PDF documents. This includes extracting RDF or other metadata found on the page. Second, it gives standardized RDF metadata generated by a consistent algorithm. This standardized generation of RDF metadata is likely to have higher reliability than "user added" metadata, which can be expected to vary in quality.
As a side note to this comment, there is much work in social networking to gather and validate metadata by voting methods, but this approach does not apply directly in a searching context. However, using data in social media is a very useful extension to searching.
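Returning to the clustering itself, the following Python fragment is a toy, centralized sketch of the cluster-formation idea described in Sect. 4.3 – pages that share metadata terms are grouped, and the number of shared terms gives the cluster's strength. It is only an illustration under simplifying assumptions (a page is an identifier with a set of normalized terms, and clusters are keyed by the exact shared term set); the meta-agent algorithm of [9] is distributed and incremental, which this version is not.

```python
from itertools import combinations

def form_clusters(pages):
    """pages: dict of page id -> set of normalized metadata terms.
    Returns {shared term set: (strength, member pages)}; strength is the number
    of shared terms and fixes the cluster's level in the multi-hierarchy."""
    clusters = {}
    for a, b in combinations(pages, 2):
        shared = frozenset(pages[a] & pages[b])
        if shared:
            clusters.setdefault(shared, set()).update({a, b})
    # A page may belong to several clusters, so the result is a multi-hierarchy.
    return {terms: (len(terms), members) for terms, members in clusters.items()}

pages = {'p1': {'A', 'B', 'C'},      # assumed toy metadata terms
         'p2': {'A', 'B', 'C'},
         'p3': {'B', 'C'},
         'p4': {'C'}}
for terms, (strength, members) in sorted(form_clusters(pages).items(),
                                         key=lambda kv: -kv[1][0]):
    print(sorted(terms), 'strength', strength, 'pages', sorted(members))
```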
The clustering of pages into the hierarchy is based on the metadata [10]. This metadata is used to group pages into clusters. The clusters are contained in meta-agents in a hierarchy, interposed between the top-level meta-agent and the leaf nodes (the searching agents). The hierarchy is a multi-hierarchy. This means that web pages can appear in multiple clusters [9]. The clusters are ranked by strength. The strength of a cluster is defined as the number of common metadata terms shared among members of the cluster. Clusters are organized into levels by their strength. Therefore, clusters of strength one, i.e., those that share only a single term, are placed just above the leaves (agents). For any search result, clusters at a given level may not exist; for example, if all the clusters share at least two metadata terms, there will be no clusters of strength one. The other observation is that clusters at a higher level do not have to subsume clusters at a lower level, although it is common that upper-level clusters do subsume lower ones. A simple example demonstrates this: assume we have metadata terms A, B and C. Figure 1 shows a possible result.
Fig. 1. Cluster Example
In this figure, the nodes represent clusters, each of which is maintained in a meta-agent. The meta-agents will have links to the agents where URL and metadata are held (the lowest-level agents are not shown in the figure to keep the diagram simple). In this hypothetical case, the cluster of terms A and B is fully contained in the ABC cluster. However, the clusters for BC and for C have terms not found in the ABC cluster; see the rightmost nodes in the figure. Additionally, there are no pages which have A without B (see the left node in the figure), and no B pages exist that do not also have C. However, there are pages which have only C. Thus, a multi-hierarchy is required to fully structure this data.

4.5 Ontology Extension

Ontology extension is the third technique. Clustering builds a multi-hierarchy of related pages based on metadata terms related to the search. Ontology extension will attempt to identify other relationships among the pages. Using these expanded relationships, additional mappings can be added. These will take several forms. One form
will be to add concept graph structures mixed into the hierarchy. The other form will gather pages together in additional clusters. One example is to use FOAF as an ontology to describe relationships between people [5]. This can be extended to represent relationships between the authors of a set of documents. For a researcher, this can be used to get a view of the work in a set of papers or pages located in a search result. Figure 2 shows the kind of relationships one can derive from a set of papers.
Fig. 2. Author and reference relationships
The data in Figure 2 is relatively simple to extract from documents. The algorithm used here is different from the clustering approach. The metadata is scanned for terms that fit a set of ontologies. These ontologies are pre-selected. The example above is an author and reference ontology that is useful for research or literature searches. The next step is to expand the data by searching for the individuals and using FOAF data to expand the structure above, see Figure 2. This can be extended by referencing the web for the authors and adding data found from social networking sites, especially where these sites have employed FOAF or other recognized ontologies. In a search involving corporations or products, a different set of relations similar to FOAF is possible. This is an area with interesting relations for products, such as "used in", "supplied by", "manufactured by", etc. The relationships can be anything that is useful within a domain. This is an ontology that should be standardized like FOAF. The other form of structuring through ontology extension is to examine the metadata from the pages to locate ontologies strongly related to the pages. This is not applied at the page level, but at the cluster level. Since a cluster is defined by a common set of metadata, this metadata is used to locate ontologies of interest. This requires a threshold on the size of the metadata set before it is useful to trigger the ontology search. Currently, the design parameter is a minimum of four terms to trigger ontology search. The search is applied to ontologies collected during the search and to ontologies previously located by the search system. The ontologies located by this approach are used as annotations for the clusters, cueing the user that ontologies are available and can be examined for semantic content.
5 Conclusions and Future Work

Computational complexity is a serious issue with this approach; the computation applied beyond searching will be on the order of N². A major reason to use the multi-agent computational approach for generating the result structure is the ease of implementation on a parallel processor system. The increasing development of multi-core processors can be used to implement the approach. Also, this approach can be tested using existing search tools to do the lowest-level searching. However, before this can become a fully useful scheme, the storage of complex results needs to be dealt with. The real value of this approach is a more complex view of the search results that carries semantic information about the results. This is a serious step forward for researchers in that the result can be saved and used to explore the topic. Further, the storage of search results should enable additional search results to be added into the structure. This work, like all the work in ontology-based search, calls out for the creation of better ontologies. This is a separate area of research that the authors intend to engage in, especially the area of ontology quality. In the meantime, this approach does have the potential to locate ontologies as part of the search and to interrelate them into the results. However, a challenge left to the users will be dealing with the ambiguity of the mismatch in ontologies. Selection of the starting ontologies is a problem left for the future. This work uses a pre-selected set of ontologies. Research is needed in reasoning systems that can select an effective subset of ontologies to start a search, based on the initial search terms. The authors' early experience leads to the hypothesis that this is not a trivial process and will require some good engineering work. The most important work for the future is to provide an organized and usable interface to view the results. This will be an interesting endeavor that synthesizes work in human computer interaction with visualization techniques.
Acknowledgement This work is supported by Sweden-Korea Research Cooperation Programme funded by STINT, The Swedish Foundation for International Cooperation in Research and Higher Education. http://www.stint.se/
References

[1] Amitay, E.: In Common Sense - Rethinking Web Search Results. In: IEEE International Conference on Multimedia and Expo. (III), pp. 1705–1708 (2000)
[2] Antoniou, G., Van Harmelen, F.: A Semantic Web Primer. The MIT Press, Cambridge (2004), ISBN 0262012103
[3] Berners-Lee, T., Hendler, J., Lassila, O.: The Semantic Web. Scientific American Magazine (May 17, 2001), http://www.sciam.com/article.cfm?id=thesemanticweb&print=true (retrieved March 26, 2008)
[4] CYC, http://www.cyc.com/ (March 19, 2010)
[5] FOAF specification, http://xmlns.com/foaf/spec/ (March 19, 2010)
[6] Håkansson, A., Hartung, R.: Using Meta-Agents for Multi-Agents in Networks. In: Arabnia, H., et al. (eds.) Proceedings of The 2007 International Conference on Artificial Intelligence (ICAI 2007), The 2007 World Congress in Computer Science, Computer Engineering, & Applied Computing, vol. II, pp. 561–567. CSREA Press, USA (2007)
[7] Håkansson, A., Hartung, R.L.: Calculating optimal decision using Meta-level agents for Multi-Agents in Networks. In: Apolloni, B., Howlett, R.J., Jain, L. (eds.) KES 2007, Part I. LNCS (LNAI), vol. 4692, pp. 180–188. Springer, Heidelberg (2007)
[8] Håkansson, A., Hartung, R.: Autonomously creating a hierarchy of intelligent agents using clustering in a multi-agent system. In: Arabnia, H., et al. (eds.) Proceedings of The 2008 International Conference on Artificial Intelligence (ICAI 2008), The 2008 World Congress in Computer Science, Computer Engineering, & Applied Computing, vol. II, pp. XX-XX. CSREA Press, USA (2008)
[9] Håkansson, A., Hartung, R.: Automatically creating Multi-Hierarchies (unpublished, 2010)
[10] Kalfoglou, Y., Schorlemmer, M.: Ontology Mapping: the State of the Art. The Knowledge Engineering Review 18(1), 1–31 (2003)
[11] Martins, B., Silva, M.: The WebCAT Framework: Automatic Generation of Meta-Data for Web Resources. In: Proceedings of the 2005 IEEE/WIC/ACM International Conference on Web Intelligence, pp. 236–242. IEEE Computer Society, Los Alamitos (2005)
[12] O'Reilly, T.: What is Web 2.0, http://oreilly.com/web2/archive/what-is-web-20.html (March 22, 2010)
[13] Passadore, A., Grossi, A., Boccalatte, A.: AgentSeeker: an ontology-based enterprise search engine, http://ceur-ws.org/Vol-494/mallowawesomepaper8.pdf (2009)
[14] SHOE, http://www.cs.umd.edu/projects/plus/SHOE/search/ (March 19, 2010)
[15] Stojanovic, N.: On the query refinement in the ontology-based searching for information. Information Systems, vol. 30, pp. 543–563. Elsevier, Amsterdam (2005)
[16] Wang, P., Xu, B.: LILY: The Results for the Ontology Alignment Contest OAEI 2007. In: The 6th International Semantic Web Conference and the 2nd Asian Semantic Web Conference (2007), http://www.ontologmatching.org
[17] Witt, A.: Multiple Hierarchies: New Aspects of an Old Solution. Interdisciplinary Studies on Information Structure 2, 55–85 (2005)
[18] Zhang, K., Tang, J., Hong, M., Li, J., Wei, W.: Weighted Ontology-Based Search Exploiting Semantic Similarity. In: Zhou, X., Li, J., Shen, H.T., Kitsuregawa, M., Zhang, Y. (eds.) APWeb 2006. LNCS, vol. 3841, pp. 498–510. Springer, Heidelberg (2006)
Categorizing User Interests in Recommender Systems* Sourav Saha1, Sandipan Majumder1, Sanjog Ray2, and Ambuj Mahanti1 1
IIM Calcutta [email protected], [email protected], [email protected] 2 IIM Indore [email protected]
Abstract. The traditional method of recommender systems suffers from the Sparsity problem, whereby an incomplete dataset results in poor recommendations. Another issue is drifting preference, i.e. the change of the user's preferences with time. In this paper, we propose an algorithm that takes minimal inputs to do away with the Sparsity problem and takes the drift into consideration by giving more priority to the latest data. The streams of elements are decomposed into the corresponding attributes and are classified in a preferential list with the tags "Sporadic", "New", "Regular", "Old" and "Past" – each category signifying a changing preference over the previous one. A repeated occurrence of an attribute set of interest implies the user's preference for such attribute(s). The proposed algorithm is based on drifting preference and has been tested with the Yahoo Webscope R4 dataset. Results show that our algorithm achieves significant improvements over the comparable "Sliding Window" algorithm. Keywords: Drifting Preference, Recommender Systems, Collaborative, Yahoo Movies.
1 Introduction Recommender systems are technology based systems that provide personalized recommendations to users. They generate recommendations by profiling each user. Profiling is done by observing each user’s interests, online behavior and transaction history. Recommendations can also take into account opinions and actions of other users with similar tastes. Recommender systems algorithms can be classified into two major categories, namely content based and collaborative filtering based. A third approach called hybrid approach combines both content based and collaborative filtering based methods. In content based recommendations, a user is recommended items similar to the items he preferred in the past. For content based recommendations, each item in the items domain must be characterized by a set of keywords that describe it. In content-based recommendations, a profile of a user is first created by keyword analysis of the items previously seen and rated by him. On the basis of the *
This research has been supported by Indian Institute of Management Calcutta (IIM Calcutta), work order no. 019/RP: SIRS//3397/2008-09.
profile created, the user is then recommended new items. Content based systems are not very popular as they lag in recommendation quality when the user’s tastes change. Also, in many domains getting high quality content data is a major issue. In collaborative filtering, the user is recommended items that people with similar tastes and preferences liked in the past. This technique mainly relies on explicit ratings given by the user and is the most successful and widely used technique. There are two primary approaches that are used to build collaborative filtering memory based recommender systems: user based collaborative filtering [1] and item based collaborative filtering [3]. In user based collaborative filtering, a neighborhood of k similar users is found for a particular user. Items are then recommended to the user from the set of items seen by its k nearest neighbors but not yet seen by him. In item based collaborative filtering, similarities between the various items are computed. Items are then recommended to the user from the set of k items that have highest similarity to the items already present in the user’s profile. One drawback of traditional recommendation systems are that they assume user interests to be static, which is not the case as preferences change with time. This change in preference can occur in two ways. The first kind of change that is most widely seen occurs when a user develops a new interest. Some common examples of this kind change are a movie viewer acquiring a recent liking for western movies, a book reader developing a new interest for books by an author or on a particular subject. Examples of users developing new interests are prevalent in most domains where recommender systems are extensively used like books, movies, music, research papers, television etc. Popular recommender system algorithms, like user based collaborative filtering and item based collaborative filtering algorithms, use only ratings data to generate recommendation. As context data like genre, subject and keywords are not used, the recommendations generated by these systems are not specifically customized for the user’s current interests, and as a result there is lag before they can detect change in his preferences. This inability of systems to quickly recognize a shift in interest is also known as the concept drift problem. The second kind of change occurs when user preference for a particular item changes over time. When asked to re-rate movies, users may re-rate the same item differently. User preferences evolve over time: a user who has rated an item highly five years back may not hold the same opinion about the item currently. Existing algorithms analyze a user’s complete movie rating data to create a profile. As equal weights are given to all the items in the user’s profile, the profile created may not give the true picture of the user’s current preferences. These drawbacks of existing algorithms can hamper the effectiveness of the recommender systems, which can lead to missed opportunities to market products to a user and can affect user confidence towards the recommender system. Most of the previous research on the problem of concept drift has focused on applying weights to time series data. The weights applied are based on a time decaying function with higher weights applied to new items. Older data points are removed or very low weights are applied to them so as to decrease their influence on recommendations generated. 
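As an illustration of this kind of time-decay weighting (a minimal sketch with an assumed exponential decay and an illustrative 90-day half-life, not the scheme of any particular cited work), older ratings can be discounted as follows:

```python
def time_decayed_weight(age_days: float, half_life_days: float = 90.0) -> float:
    """Exponential decay: a rating loses half of its influence every
    half_life_days days (the half-life value is an illustrative assumption)."""
    return 0.5 ** (age_days / half_life_days)

def decayed_average(ratings):
    """ratings: list of (rating_value, age_in_days) pairs.
    Returns a time-weighted average in place of a plain mean."""
    weights = [time_decayed_weight(age) for _, age in ratings]
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(v * w for (v, _), w in zip(ratings, weights)) / total

# An old 5-star rating now counts for less than a recent 3-star rating.
print(decayed_average([(5.0, 720), (3.0, 10)]))
```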
This approach can be helpful in solving the problem of concept drift when the second kind of interest change occurs, i.e. when the user’s preference for a particular item changes over time and the present data points present a truer picture of the user’s interest. But this approach of using only current data points for
generating recommendations further magnifies the problem of data sparsity. While previous research on concept drift in recommender systems has tried to tackle the problem by using weighted data, none of it has used content data such as genre or keywords. Without content data, the more prevalent problem of concept drift due to new interests cannot be solved. In this paper, we propose an approach to categorizing user preferences from their transaction history in a recommender system. Our approach can be used by context based, collaborative filtering based and hybrid recommender system algorithms to make better predictions. Our approach of categorizing user interests can be used as one component of a hybrid algorithm designed for solving the concept drift problem. In section 2 we describe our categorization approach for user interests. In section 3 we provide an example to further clarify our approach. In section 4 we review previous research in the area of drifting preferences. Section 5 describes the experimental evaluation. Section 6 discusses the results obtained. Finally, we conclude the paper in section 7.
2 Proposed Model for Categorizing User Interests A user’s interests are identified by analyzing the attribute or content descriptor of items associated with his transactions. In movies domain, movie genres like action, drama, comedy, animation, etc., are used to characterize a movie. If a user only selects movies belonging to drama and action genres then we can say that the user’s interests are drama and action. Similarly, in books domain, subject keywords are used to identify interests. While in music domain genres like rock, pop, country music etc are used to identify interests. Our approach is relevant for all domains where items can be differentiated on the basis of content descriptors and can be grouped into categories based on the content descriptors. In this paper, we have used movies domain to illustrate our approach. We use the terminology interests and genres interchangeably. In movies domain, the genre of the movie is the most popular content descriptor data that is used to describe movie content. A single movie can have more than one genre associated with it. For example, the movie Casablanca is associated with drama, romance and war genres. To elaborate our categorization model, we first describe the different classes we have defined to classify a user set of interests derived from his transactions, and then we explain our design for movement of interests from one class to another for a user. 2.1 Interests Categorization In our proposed approach, a user’s interests can be categorized into five classes. Sporadic: Interests categorized as “Sporadic” category for a user are those attributes that occur once in a while in the user’s transactions. “Sporadic” interests are initially considered as a deviation from the user’s regular interests and seen as transient interests, as the probability of them occurring again in future transactions is minimal. Recommender systems should ideally ignore these interests as predicting the reoccurrence of these interests in future transactions is not feasible. A user buying a gift can be seen as an example of “Sporadic” interests. A user who does not like the
science-fiction genre may buy science-fiction books as gifts for friends or family. This change in user interest should be considered as a passing interest rather than a new interest. Recommender systems that consider sporadic interests as a change in a user’s taste will end up recommending items that may not be useful to the user. The challenge is to identify those sporadic interests that have the potential to become new interests. Interests that were once “Sporadic” but don’t recur in future has to leave the system altogether. If it starts recurring it has to start afresh as “Fresh Interest” (Fig-1). New: “New” interests are those interests that are new to the user but have been occurring more often in the user’s recent transactions. For a user, few of his interests categorized as “Sporadic” interests become categorized as “New” interests after they occur more often in successive transactions. Similarly, non-occurrences of an interest existing in “New” category in successive transactions will result in the interest shifting back to the “Sporadic” category of the user. Similarly, interests in the “New” category that occur frequently in successive transactions get categorized as “Regular” interest for the user. “New” interests are of much importance from a marketer’s point of view: as the user is exploring a new area of interest, he is more likely to be influenced by a recommendation made by the system. Regular: Once the user has subscribed to some particular attributes of interest a considerable number of times, such an interest can be put in the “Regular” category. This particular type of interest is assumed to hold good for the future transactions of the user as well and repeated occurrences only reinforce the belief. It needs to be mentioned that for every categories, inclusion of an interest takes place after the corresponding threshold is overcome (similarly, breaching the threshold results in exclusion and shifting to a lower category). Old: Interests once categorized in the “Regular” Category of a user gets classified in the “Old” Category when they stop occurring in frequent successive transactions. An interest classified as “Old” interests can get reclassified as a “Regular” interest with increased occurrence in recent transactions (and thereby successfully crossing the threshold). “Old” and “New” have nearly similar preferences during prediction. Past: Successive non-occurrence of an interest in the “Old” category results it being shifted to the “Past” category. “Past” categories can have equal or more preferences compared to “Sporadic” category but they definitely have a lower preference compared to the “New” category. Interests in the “Past” category can turn back to “Old” with repeated occurrences or may have to leave the system altogether with repeated non-occurrences. An interest having once left the system if starts recurring once again later, it has to follow the channel of Fresh Interest as shown in Fig-1 below. An interest leaving the system from “Sporadic” or “Past” has to start afresh. 2.2 Movement of Interests The designing of the algorithm here takes care of the drifting effect. Repeated occurrence of interest adds more weightage to that interest. Alternatively, non-occurrence decreases the weightage. The movement of the interest depends on the threshold defined for the particular category. With repeated occurrences the interest is moved to the next higher category if it’s more than the threshold defined for the next category. 
Similarly, if some interest stops occurring, then its weightage initially remains
constant for the next few transactions and then finally stops dropping. Both the decay percentage and the waiting period depend on the preference category to which the interest currently belongs. Thus if the users had an inclination for a given interest sometime back, which has stopped recurring then such preference initially waits for the next few transaction and then starts fading off and finally goes away.
Fresh Interests --(threshold)--> Sporadic --(threshold)--> New --(threshold)--> Regular
Lost Interests <--(threshold)-- Past <--(threshold)-- Old <--(threshold)-- Regular

Fig. 1. Movement of interests between the categories
Let’s define the new algorithm now. We assume the data is in chronological order, grouped by user. Let there be N users, U1, U2, …, UN. A particular user, denoted by Ui, has moved through mi elements. Thus, the total number of records in the system is given by $\sum_{i=1}^{N} m_i = M = \sum_{j=1}^{M} R_j$.
Define the threshold set T = {{I},{E}} = {{TS}, {TN}, {TR}, {TO}, {TP}} for the “Sporadic”, “New”, “Regular”, “Old” and “Past” categories respectively, where each of the Ts contains the thresholds for both inclusion in and exclusion from the corresponding category. Define the weight function f(c), where c is the category to which the genre currently belongs. h(c,w) is an activation function taking binary values (0 or 1), where w is the pre-defined window size for successive non-occurrences of any given attribute currently belonging to a particular category c (here Sporadic, New, Regular, Old or Past), after which the decaying function gets activated. Hence the decaying function d(c,w) is the product f(c)*h(c,w). Initially, when an attribute comes in as a fresh interest for a given user, it is given some weightage based on its frequency of occurrence. Once the cumulative weight crosses the threshold for the starting category, which is “Sporadic”, the attribute of interest is tagged as “Sporadic”. From “Sporadic” the attribute can move to the “New” category and then to the “Regular” category, each time satisfying the threshold for the given category (see Fig. 1). If the attribute stops recurring, indicating a loss of interest or oversaturation, its weight starts decreasing after a given waiting period (the pre-defined window size), by an amount based on the category to which it currently belongs. The interest moves to the lower categories each time the threshold for a given category is breached, and finally moves out of the system.
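A minimal sketch of this update rule in Python is given below; the per-category decay amounts f(c) and window sizes w are those used in the worked example of Section 3 (which applies the decay subtractively), and the promotion/demotion bookkeeping against the thresholds is only indicated in a comment, not implemented:

```python
# Per-category decay amounts f(c) and waiting windows w(c); these are the
# values used in the example of Section 3, not universal constants.
F = {"Sporadic": 0.75, "New": 0.50, "Regular": 0.25, "Old": 0.50, "Past": 0.75}
W = {"Sporadic": 2, "New": 3, "Regular": 4, "Old": 3, "Past": 2}

def h(category: str, consecutive_misses: int) -> int:
    """Activation function: 0 while the waiting period lasts, 1 once the number
    of consecutive non-occurrences reaches the window size for the category."""
    return 1 if consecutive_misses >= W[category] else 0

def step(weight: float, category: str, occurred: bool, consecutive_misses: int):
    """One transaction step for a single attribute of interest.

    Occurrence adds the incremental weightage of 1; non-occurrence subtracts
    the decay d(c, w) = f(c) * h(c, w). Promotion or demotion to the adjacent
    category is handled separately, when the cumulative weight crosses that
    category's inclusion or exclusion threshold."""
    if occurred:
        return weight + 1.0, 0
    consecutive_misses += 1
    return weight - F[category] * h(category, consecutive_misses), consecutive_misses
```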
3 An Example The following example illustrates our approach more clearly. Let us consider a user who rents movies (the element under consideration) in the last 10 transactions. The following table shows how the attributes of interest (here movie genres) shift across the five sets for this user on the basis of his movie selection behavior.
Here f(c) = 0.25 for “Regular”, 0.50 for “Old” and “New”, and 0.75 for “Sporadic” and “Past”.
The activation function for decaying h(c,w) has the value 0 or 1 based on the following pre-defined window size for successive non-occurrences of an already occurred attribute of interest for a given user. Here w = 4 for “Regular”, 3 for “Old” and “New”, and 2 for “Sporadic” and “Past”.
Thus h(“Regular”,2) = 0 while h(“Sporadic”,2) = 1. To exemplify, once a particular genre is in the “Regular” category, even if it does not occur for 3 successive times, it still remains in that category. Only after the 4th consecutive non-occurrence does the weightage start decaying by a value of 0.25, which is the f(c) value for the “Regular” category. The attribute moves to the lower category on breaching the threshold for “Regular”. The threshold set T = {TS, TN, TR, TO, TP} = {{0,0}, {3,3}, {5,5}, {2,2}, {-0.5,0.5}}. In the current example, the inclusion threshold and the exclusion threshold are the same and the incremental weightage for occurrence of an attribute of interest is 1 per transaction, irrespective of interest category. All the above values have been established experimentally for the given dataset; the values would depend on the nature and content of the data.

Transaction | Movie Genre | Sporadic | New | Regular | Old | Past
T1 | Thriller, Science Fiction/Fantasy, Action/Adventure | Thriller(1), Science Fiction/Fantasy(1), Action/Adventure(1) | - | - | - | -
T2 | Action/Adventure, Science Fiction/Fantasy | Thriller, Science Fiction/Fantasy(2), Action/Adventure(2) | - | - | - | -
T3 | Action/Adventure | Thriller(0.25), Science Fiction/Fantasy(2) | Action/Adventure(3) | - | - | -
T4 | Comedy, Romance, Kids/Family, Animation | Science Fiction/Fantasy(1.25), Comedy(1), Romance(1), Kids/Family(1), Animation(1) | Action/Adventure(3) | - | - | -
T5 | Science Fiction/Fantasy, Action/Adventure | Comedy(1), Romance(1), Kids/Family(1), Animation(1), Science Fiction/Fantasy(2.25) | Action/Adventure(4) | - | - | -
T6 | Action/Adventure, Drama, Science Fiction/Fantasy | Comedy(0.25), Romance(0.25), Kids/Family, Animation, Drama(1) | Science Fiction/Fantasy(3.25) | Action/Adventure(5) | - | -
T7 | Romance | Romance(1.25), Drama(1) | Science Fiction/Fantasy(3.25) | Action/Adventure(5) | - | -
T8 | Romance | Romance(2.25), Drama(0.75) | Science Fiction/Fantasy(3.25) | Action/Adventure(5) | - | -
T9 | Romance | Science Fiction/Fantasy(2.75) | Romance(3.25) | Action/Adventure(5) | - | -
T10 | Romance | Science Fiction/Fantasy(2) | Romance(4.25) | Action/Adventure(4.75) | - | -
4 Previous Work Done Recommender systems are a widely researched area [3], prediction problem and security issues being the more popular issues worked upon. However, not much work has been done on the concept drift problem. Most of the previous works [4, 5, and 6] are mainly different variations of assigning weights to data. The basic approach used is to assign a greater weight to new data, and discount old data so as to decay or remove the effect of old data. In [4] decreasing weights are assigned to old data. In [5] the authors have proposed an item similarity based data weight in addition to time based data weight. They have added the new weights to handle the issue of reoccurrence of user interests. In [6] a new similarity measure is used and the authors propose using weights for items based on expected accuracy on the future preferences of the user. Most of the earlier works have formulated the problem as a prediction accuracy problem. As a result, they have tried to measure the effectiveness of their approaches through accuracy metrics like mean absolute error or precision [7]. While these metrics can capture the effectiveness of the algorithm to make rating prediction of unseen items for a user, they cannot measure the capability of the algorithm to detect interest shift due to new interest. Unlike our approach, none of the existing approaches use genre data or content data to solve the problem of concept drift in recommender systems. And we believe that any approach that proposes to tackle the problem of detecting new interest has to use genre data or content data.
5 Experimental Evaluation We experimentally evaluated our proposed approach on the Yahoo Webscope R4 dataset, which contains Yahoo Movies user ratings and descriptive movie content. In total, the training set had 174130 records, 6719 distinct users and an average movie viewership of 25.92 per user. The maximum number of movies seen by a particular user was 1388, while the minimum was 1. To conduct the evaluation, we used the test dataset, again provided by Yahoo Webscope R4. The test dataset had a total of 8358 records, 1675 distinct users
and an average of 5 records (4.9899 to be exact) per user. The minimum number of test records for a single user was 1, while the maximum was 145. We evaluated the test data against the tables constructed from the training data. If the “Regular”, “New” and “Old” categories contained interests that could be successfully matched to the interests of the given user in the test dataset, we say a hit has been made; otherwise we say it is a miss. The “hit ratio”, expressed as a percentage, came out as 85.0009%. In other words, in about 85% of cases, the user’s movie preferences matched those computed from the genre preferences by the given algorithm. To benchmark our algorithm, the closest match we found was the “Sliding Window” technique, where all items that appeared in the last “n” transactions are considered for prediction. We have considered the same in our results. We have also computed the individual movie genres as a percentage of the total number of transactions and used the more frequently occurring genres to predict the forthcoming movies. Both are discussed in more detail in the results section.
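A minimal sketch of this hit/miss computation is given below; the shapes of the test records and of the per-user category sets are assumptions made for the illustration, not the actual data layout used in the experiments:

```python
def hit_ratio(test_records, profiles):
    """test_records: iterable of (user_id, genres_of_the_test_movie) pairs.
    profiles: dict user_id -> {category_name: set_of_genres} built from the
    training data. A record is a hit if any of its genres is currently held
    in the user's "Regular", "New" or "Old" category."""
    hits = total = 0
    for user_id, genres in test_records:
        total += 1
        profile = profiles.get(user_id, {})
        predicted = set()
        for category in ("Regular", "New", "Old"):
            predicted |= profile.get(category, set())
        if set(genres) & predicted:
            hits += 1
    return 100.0 * hits / total if total else 0.0
```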
6 Results and Discussions To test the results, we have benchmarked our algorithm against the “Sliding Window”. The lists of genres that occurred in the last “n” transactions have been checked against the genres of the movies provided in the test dataset. The result follows:

Size of Sliding Window | Hit ratio as a percentage (rounded values)
3 | 53%
4 | 61%
5 | 68%
Thus, with a sliding window encompassing the last 5 movies, the prediction accuracy was computed as 68%. We grouped the genres into Very Frequent, Frequent and Least Frequent based on the percentage of times they appear in the user’s transactions. The result follows:

Serial No. | Very-Frequent Threshold Percent | Frequent Threshold Percent | Least-Frequent Threshold Percent | Hit Ratio %
1 | 75 | 50 | 25 | 14.69%
2 | 70 | 40 | 15 | 28.14%
3 | 66 | 33 | 16 | 43.20%
4 | 60 | 35 | 20 | 38.56%
5 | 50 | 25 | 10 | 58.35%
We find that our algorithm produces much more refined results compared to the approaches described above.
We observed that users who had at least 20 records in the training set could have their interests satisfactorily categorized into the well-defined sets. We computed the attribute cardinality in the sliding window pane of size 5 and the number of attributes in the combined category of “Regular”, “Old” and “New” from our algorithm. On average, our algorithm came up with 4.98 elements, while the sliding window had 4.65 elements in its preference set. It might seem that the accuracy of the algorithm has been enhanced by the extra elements compared to the sliding window. However, it is interesting to note that an increase of about 6% in the attribute set size increases the prediction accuracy by 25%.

We raise a few concerns here as well. We have worked only with a single decomposable characteristic of the dataset. In most practical situations, we may lose important dimensions if we work only with a single attribute. Hence, the algorithm needs to be extended to take care of cases where the dataset has multiple reference element-sets and non-homogeneous attributes. The thresholds for transition and the window size of the activation function have been experimentally evaluated in this particular example; a generalization technique for determining such limiting values is yet to be found. We have not considered the ratings of the movies or the profiles of the users, which could have given some newer findings. If sufficient data is available, one can take the user's movie-watching habits into account as well, which would enable different thresholds and different window sizes for different types of users in the same dataset.

Having said all this, the beauty of the algorithm is its simplicity and robustness. The algorithm is computationally not very intensive and requires very little resource in terms of both space and time. In most real-world situations, we judge an object by a unique distinguishing attribute only, so the algorithm might find tremendous use with homogeneous multi-attribute elements for a given dataset.
7 Conclusion In this paper, we propose an approach towards categorizing user interests in a recommender system by analyzing users’ previous transactions. Every user’s interests are categorized into one of five categories or sets, namely sporadic interests, new interests, regular interests, old interests and past interests. The motivation behind our approach is to provide a solution for the problem of concept drift; we are more interested in tracking a user’s changing interests than in plain accuracy. We would like to extend our algorithm to take into account the issues discussed in section 5 and apply it to popular research datasets such as that of Netflix. The strength of our algorithm is its ease of understanding and simplicity of construction. We look forward to extensions in a similar direction, but encompassing much more complexity, in our future work.
References 1. Herlocker, J., Konstan, J., Borchers, A., Riedl, J.: An Algorithm Framework for Performing Collaborative Filtering. In: Proceedings of SIGIR, pp. 77–87. ACM, New York (1999) 2. Lam, S., Riedl, J.: Shilling Recommender Systems for Fun and Profit. In: Proceedings of the 13th International WWW Conference, New York (2004)
3. Massa, P., Avesani, P.: Trust-aware collaborative filtering for recommender systems. In: Proceedings of the Federated Int. Conf. on the Move to Meaningful Internet (2004) 4. Massa, P., Avesani, P.: Trust-aware recommender systems. In: Proceedings of the 2007 ACM Conference on Recommender Systems, Minneapolis (2007) 5. Massa, P., Avesani, P.: Trust-aware Bootstrapping of Recommender Systems. In: Proceeedings of ECAI Workshop on Recommender Systems, Italy (2006) 6. Massa, P., Avesani, P.: Trust metrics on controversial users: balancing between tyranny of the majority and echo chambers. International Journal on Semantic Web and Information Systems 7. Mobasher, B., Burke, R., Bhaumik, R., Williams, C.: Towards Trustworthy Recommender Systems: An Analysis of Attack Models and Algorithm Robustness. ACM Transactions on Internet Technology 7, 23:1–23:38 (2007) 8. Rashid, A., Karypis, G., Riedl, J.: Learning Preferences of New Users in Recommender Systems: An Information Theoretic Approach. SIGKDD Explorations 10, 90–100 (2008) 9. Sarwar, B., Karypis, G., Konstan, J., Riedl, J.: Item-based Collaborative Filtering of Recommendation Algorithms. In: Proceedings of the 10th International WWW Conference, Hong Kong (2001) 10. Victor, P., Cornelis, C., Cock, M., Teredesai, A.: Key figure impact in trust-enhanced recommender systems. AI Communications 21(2-3), 127–143 (2008)
Architecture of Hascheck – An Intelligent Spellchecker for Croatian Language Šandor Dembitz, Gordan Gledec, and Bruno Blašković University of Zagreb, Faculty of Electrical Engineering and Computing, HR-10000 Zagreb, Croatia {sandor.dembitz,gordan.gledec,bruno.blaskovic}@fer.hr
Abstract. The design and development of a spellchecker for highly inflected languages is commonly regarded as a challenging task. In this paper we present the architecture of Hascheck, a spellchecking system developed for Croatian language. We describe functional elements that make it an intelligent system and discuss specific issues related to Hascheck’s dictionary size as well as its guessing and learning capabilities. Keywords: spellchecking, intelligent system, online spellchecker, natural language processing, learning system.
1 Introduction The development of a spellchecker for highly inflected languages (languages with a great variety of different word forms for each lemmatised word) is a demanding and tedious task. The Croatian language, belonging to the group of Slavic languages, like Russian, Czech, Slovak, Polish and others, is such a language. It was estimated that a dictionary of 1 to 100 million words is needed for the good coverage of Russian texts [1], while a spellchecker with approximately 100,000 words would suffice for the coverage of English texts. If the estimation for the Russian stands, it can be applied to the Croatian language as well.

In this paper we describe the architecture of Hascheck - an intelligent spellchecker developed for the Croatian language. Its name stems from its Croatian name: Hrvatski akademski spelling checker. Starting as an e-mail service 18 years ago, it served the basic error-reporting needs of a handful of users at the Faculty of Electrical Engineering and Computing in Zagreb. From the start, it was able to learn from its users’ misspellings. Over time, Hascheck’s functionality was extended to include misspelling corrections; it soon became a sophisticated intelligent system. In 2003 it started operating as an online service located at http://hascheck.tel.fer.hr/.

The remainder of this paper is organised as follows. Chapter 2 gives a brief introductory overview of the Croatian language. Chapter 3 describes the architecture of the spellchecker. Chapter 4 discusses some of the issues related to the development of the system and gives a brief overview of system’s usage. Chapter 5 concludes the paper.
2 Notes on Croatian Language Croatia is a West Balkans country with the population of 4.5 million. The official language spoken in Croatia is the Croatian. The writing system is based on Latin script. Croatian orthography is (mostly) phonetically based. It uses five letters with diacritics to express Croatian sounds: “č” corresponds to English digraph “ch” in “checker”, “ć” corresponds to Dutch digraph “tj” in “Aantjes, “đ” corresponds to Italian digraph “gi” in “Giulia”, “š” corresponds to English digraph “sh” in “shop”, and “ž” corresponds to Portuguese letter “j” in “Joaquim”. In the Croatian alphabet three digraphs are treated as letters: “dž” corresponds to the English letter “j” in “job”, “nj” corresponds to the Spanish letter “ñ” in “Buñuel”, and “lj” corresponds to the Spanish digraph “ll” in Castilla. In total, the Croatian alphabet contains 30 letters - 5 vowels and 25 consonants. The Croatian sound system uses two diphthongs, short and long /ịe/. In writing, they are represented by the digraph “je” (short) and trigraph “ije” (long), respectively. The writing system generally does not use accents. Foreign names borrowed from languages with Latin script are written in Croatian in their original orthography. Names from languages with non-Latin script should be transliterated according to Croatian transliteration rules, but in modern Croatian writing practice, English transliteration of names is used very often, since it makes them instantly recognizable. Croatian is a highly inflective language. Pronouns, nouns, adjectives and some numerals decline, and verbs conjugate for person and tense. Nouns have three grammatical genders (masculine, feminine and neuter). Grammatical gender of a noun affects the morphology of other parts of speech (adjectives, pronouns and verbs) attached to it. Nouns are declined into seven cases. The language is still in the process of standardization, with three orthography standards, which differ in many details, in use.
3 Hascheck’s Architecture Starting from specific issues tied with Croatian language, stated in Chapter 2, the spellchecker for Croatian had to be:
• orthographically tolerant and user-oriented;
• able to cope with Croatian inflections in an acceptable way;
• flexible regarding improvements of its linguistic functionality;
• developable with modest man-power resources, expectably with zero funding;
• economic in terms of computational resources (memory, processor, disk etc.);
• easy to maintain.
The demands set before Hascheck required it to be an intelligent system. Hascheck comprises two mutually interacting functional subsystems (Fig. 1):
• real-time subsystem and
• post-processing subsystem.
Real-time Subsystem is the system’s core. It is employed when a new text comes in for processing. As the new text arrives for analysis, the Extractor block extracts valid tokens from further processing. Then, non-recognized tokens pass through the Classifier. Those two blocks were initially the only functional blocks used by the system; as the dictionary grew, the Guesser and the Corrector were added, which extended the functionality and enabled Hascheck to suggest corrections to the user in the form of a textual report. Post-processing Subsystem performs learning when the system supervisor requests it. Learning is based on the data collected during real-time usage (statistics, logs, input text and reports). As the result of the learning process, the dictionary is updated, thus improving the spellchecker’s functionality.
Fig. 1. The Architecture of Hascheck
With all the functional blocks and subsystems at work, Hascheck demonstrates its intelligence in the following ways:
• by emphasizing tokens which might be valid or invalid Croatian word-types (the peculiarity concept is realized through fuzzy evaluation of unrecognized alphabetic strings in the Classifier block; a sketch of this n-graph test follows this list);
• by pointing out tokens which might be legal inflected forms of words or names contained in its dictionary (morphological tagging is based on language grapheme statistics performed in the Guesser block);
• by offering the most probable corrections for non-tagged tokens (based on keyboard layout and language grapheme statistics; the function is realized in the Corrector block);
• by learning new word- and name-types from texts received for processing (expert post-processing supervised by humans, realized in the Learning System block).
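The n-graph peculiarity test realized in the Classifier (and described in detail in Section 3.1) can be sketched as follows; the sets of known 3-, 4- and 5-graphs are assumed to have been extracted from the dictionary beforehand, and short or otherwise unusual strings are assumed to be routed to their own classes before this test:

```python
def ngraphs(token: str, n: int) -> set:
    """All (non-positional) substrings of length n occurring in the token."""
    return {token[i:i + n] for i in range(len(token) - n + 1)}

def peculiarity_class(token: str, known3: set, known4: set, known5: set) -> str:
    """Classify an unknown alphabetic string by the rarest n-graph it contains.
    known3/known4/known5 are the 3-, 4- and 5-graph inventories derived from
    the dictionary; short and otherwise unusual strings are assumed to have
    been handled separately before this test."""
    if ngraphs(token, 3) - known3:
        return "extremely peculiar"      # contains an unknown 3-graph
    if ngraphs(token, 4) - known4:
        return "very peculiar"           # contains an unknown 4-graph
    if ngraphs(token, 5) - known5:
        return "moderately peculiar"     # contains an unknown 5-graph
    return "almost non-peculiar"         # all 5-graphs are known
```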
3.1 Classification Algorithm The string classification algorithm used by Hascheck is language independent. It is based on a string weight concept [2], where the weight is a measure derived from language grapheme statistics of variable order. The string weight concept applied in UNIX program typo [3] was used as the basis for our algorithm, but was modified: the original string peculiarity calculation drew the measure of peculiarity from the processed text; our modification draws it from the dictionary and frequency of words occurring in it. In addition, we calculated string weight using more robust binary statistics of non-positional 3-graphs, 4-graphs and 5-graphs that occur in Croatian words. We preserved the term “peculiarity” and obtained 4 peculiarity classes for alphabetic strings not contained within Hascheck’s dictionary: • extremely peculiar strings, containing at least one unknown 3-graph; • very peculiar strings, containing at least one unknown 4-graph; • moderately peculiar strings, containing at least one unknown 5-graph; • almost non-peculiar strings, i.e. strings with all known 5-graphs; We also defined three additional classes for unusual strings: • short alphabetic strings (3 letters or less); • strings containing both letters and cyphers (eg. “F16”); • strings with unusual capitalization (eg. “WordPerfect”). The peculiarity algorithm, which is simple, quick, robust and effective because of phonetic character of Croatian writing system, is not applied on unusual strings quoted above. In the report presented to the user, each unknown string receives a tag depending on the class to which it belongs. 3.2 The Dictionary The dictionary is a paramount part of the system; it is organized in three word-list files: Word-Type (WT) file, Name-Type (NT) file and English-Type (EngT) file. The initial WT-file was derived from the right side of the English-Croatian Lexicographic Corpus (ECLC), a bilingual one-million token corpus of approximately 80,000 elements [4]. It contained a significant number of idioms, where the word usage was given by examples. This brought Croatian words into their inflected forms. After a careful manual spellchecking, we produced the initial WT-file with approximately 100,000 different Croatian common word-types; under “common word” we mean words which may occur written in small letters only, or with an initial capital letter (at the start of sentence, for example), or written in capital letters only, and which have not been borrowed with its orthography from foreign languages but belong intrinsically to the Croatian language. The content of WT-file is used to obtain n-graph statistics needed by the Classifier block. The left-hand side of ECLC was used to obtain representative English word-type sample. This sample was spellchecked with several English spellcheckers [5][6]. It produced the EngT-file with approximately 70,000 different English word-types. The decision to include English in Croatian spellchecking was based on common practice: English, as the lingua franca of modern times, often comes mixed with Croatian in contemporary Croatian writing, especially in the academic environment. Because of
the great distance between the two languages in terms of Damerau-Levenshtein metrics [7][8], there was no fear that potential Croatian misspellings could be treated as proper English, or vice versa [9]. The NT-file contains all case-sensitive elements of writing. These are proper and other names, abbreviations, acronyms, as well as names with unusual use of small and capital letters, like LaTeX or WordPerfect, etc. Furthermore, it contains words from foreign languages that appear in Croatian writing in their original orthography. The file started as an empty file, but in the course of learning, it increased to approximately 600,000 different name-types. In the same time, the WT-file increased from the initial 100,000 to 800,000 different Croatian common word-types. A fourth file may also be considered as a part of Hasheck’s dictionary: the ErrorType file (ErrT-file), which contains all strings that have ever passed though Hascheck, but were never accepted as common word-types or name-types. So far, the file contains approximately 1.3 million different error-types. The ErrT-file is used to decide whether a token should pass through Guesser block or not. If a token is in ErrT-file, it goes directly to the Corrector block. 3.3 The Guesser The word endings that make a language highly inflected are predicable, not only morphologically but also statistically [10]. With Hascheck, we have adopted the statistical approach in very early stages of its development and incorporated it into the functionality of the Guesser, using the statistics of endings found in WT- and NT-file. Guesser is able to associate various Croatian word endings with the words from WT- or NT-files. A successfully associated token is considered to be a correct, but unknown word- or name-type. Only if the token is in ErrT-file, the association is disregarded and the token goes to the Corrector block. To preserve precision, successful association with WT-file content is considered to be correct only if the token belongs to the lowest peculiarity class. 3.4 The Corrector As the final stage of the spellchecking procedure comes the Corrector. At the very beginning, only most frequent Croatian misspellings (letters “č”/“ć”, false diphthong presentation of “ije”/“je”) were reported with corrective suggestions. When Hascheck’s text coverage stabilized above 95%, general error correction was introduced. For approximate string matching the freeware tool nrgrep was used [11]. The general error correction starts with errors of distance 1 (one letter substitution/insertion/deletion or two adjacent letters transposition), but with the increase of dictionary size it was extended to errors of distance 2. Because of the phonetic character of written Croatian, higher distances are not needed. Distance 2 also covers the largest number of English misspellings and typos, too [11]. The results of approximate string matching are pondered with respect to keyboard layout and Croatian phonetic rules, but also by taking into account the frequency of words found in Hascheck’s dictionary. The pondering dominantly influences the order in which the correction suggestions are presented to the user. If the difference between ponders is extremely high, low pondered suggestions will not be displayed at all.
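Hascheck’s approximate matching is built on nrgrep and on pondering with respect to keyboard layout, phonetics and word frequency; the sketch below only illustrates the underlying distance-1/distance-2 notion (Damerau-Levenshtein distance, optimal string alignment variant), with a plain nearest-first ordering standing in for the actual pondered ranking:

```python
def dl_distance(a: str, b: str) -> int:
    """Damerau-Levenshtein distance (optimal string alignment variant):
    substitutions, insertions, deletions and transpositions of two adjacent
    letters each count as one edit."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[len(a)][len(b)]

def candidates(token: str, dictionary, max_dist: int = 2):
    """Dictionary entries within edit distance max_dist of the misspelled token,
    nearest first (a plain placeholder for the pondered ranking)."""
    scored = [(dl_distance(token, w), w) for w in dictionary]
    return [w for dist, w in sorted(scored) if dist <= max_dist]
```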
3.5 The Learning System The Learning System is activated when a sufficient number of unrecognized tokens is collected. Those tokens once again pass through all Real-time Subsystem blocks, but this time no exclusion of tokens existing in ErrT-file is done, and all tokens, regardless of their peculiarity class, pass through Guesser block. At the end of the learning process, the Learning System creates two files, wordand name-type file to be used to update the dictionary. Both files are carefully checked before update, since the human factor in learning and supervision is also a possible source of errors. After a careful consideration, new tokens enter into the dictionary. The Learning System also updates the frequency profile of WT-, NT-, and ErrT-file. In order to preserve the privacy of our users, the full input text is never stored in any form on the system.
4 Experience and Discussion Hascheck started as an e-mail embedded service in 1993, first locally, only for the staff of the Faculty of Electrical Engineering and Computing in Zagreb, but it quickly became a public service, primarily oriented towards the Croatian academic community. In the summer of 2003, the e-mail service was upgraded into a web service at http://hascheck.tel.fer.hr/. In its e-mail phase, Hascheck processed 6,321 texts that formed a corpus of 39,135,406 tokens. In that phase it was quite an exclusive service with only several hundred users from Croatia and abroad, but during this phase many books and three voluminous pieces of Croatian lexicon were processed. With the web interface Hascheck became a broadly accepted service. During the web phase, it processed more than 800,000 texts received from more than 80,000 IPaddresses from Croatia and 61 other national domains. The web-phase corpus processed until now (March 20, 2010) amounts to more than 200 million tokens; only in the last month of its service, Hascheck served more than 4,000 users and processed more than 36,000 texts. Extensive report on Hascheck’s usage can be found in [13]. 4.1 On the Dictionary Hascheck’s dictionary is constantly growing. However, it does not grow as much as it could, neither does each legal Croatian word-type form nor name-type that appeared in processed texts enter the dictionary. A reason to reject them is their closeness to frequent or potentially frequent misspellings or typos. A good example is the relation between two Croatian grammatical forms. These are the adverb “slijedeći” = “following adv.”, which is not declinable, and the adjective “sljedeći = “the following, next, subsequent, successive adj.”, which is declinable. The ErrT-file provides that the declined forms of the adverb “slijedeći”, hence grammatically impossible “wordtypes”, are the most frequent errors in Croatian writing. Therefore the adverb “slijedeći” was deleted from the dictionary, although it is a perfect Croatian word. According to our opinion, it is better to report a non-existing error, a (rare) correct use
of the adverb “slijedeći”, than to overlook frequent errors. There are thousands of such examples, and only human competence is able to distinguish them. The corpus-based approach to dictionary updating is another restriction to its expansion. Only word- and name-types with confirmation in Croatian texts are in the dictionary. Theoretically, it would be possible to expand the dictionary with all allowable forms of words in it. However, we never tried to do this because of theoretical and practical reasons. 4.2 On the Guessing Block Hascheck’s guessing precision is about 90%, both for word-types and name-types. Guessing errors occur when an improper but allowable ending follows a proper root, which happens in writing, or when a root for an ending is misrecognized. Hascheck’s guessing recall differs significantly between word- and name-types. While more than 80% of all common Croatian word-types ever learnt were correctly tagged before learning, the same can be stated for only 40% of name-types. Name occurrence in writing is very stochastic and unpredictable – names come from various sources, while common words occurrence and their word forms are regulated by language grammar, hence they are much more predictable. It explains the recall differences. Generally speaking, the guessing could be extended to prefixes too. It could increase the guessing recall for common word-types significantly. However, because of the fear that it might reduce the guessing precision, since many productive Croatian prefixes are semantically motivated, while suffixes are generally only grammatically motivated, we have never done it in the real system. We made several tests, but they confirmed our fear only. 4.3 On the Learning System The record structure of Hascheck’s report is illustrated by Fig. 2:

1. -ss- Santaninom 3 => P!
2. -ss- Sonije 5 => Sonje? Sinije? Sočnije? Sofije? Sorije? Jonije?
3. -ss- Spiroskog 1 => Pirotskog? Sirskog? Spiljskog? Spinskog? Sportskog?
4. -ss- Valka 1 => Vlaka? Valja? Valla? Valma? Balka? Falka? Galka? Valjka?
5. -ss- Vatana 1 => Vagana? Varana? Hvatana? Batana? Catana? Atana? Vataža?
6. -ss- astronautovih 1 => p!

Fig. 2. Sample of the report prepared for learning
First, the peculiarity class mark comes (“-ss-” means “almost non-peculiar string”), then the questionable token followed by the number of times it occurred in the processed text, and finally either the suggestions for correction or the tagging mark (“P!” means that the string was successfully associated with a token from the NT- or EngT-file, while “p!” means that the string was successfully associated with a token from the WT-file). Inspection of the corresponding context file confirms that the association was correctly performed in both cases in Fig. 2. Line 1 refers to Carlos Santana, a world-wide
famous musician, and the token is an adjective derived from his family name, while line 6 is an adjective derived from the Croatian common word “astronaut”, one of the rare words that are spelled equally in English and Croatian. “Spiroskog” (line 3) is genitive of the Macedonian family name Spiroski. Line 5 refers to the Vatan - Saint-Fargeau stage of Tour de France 2009, where “Vatana” is the genitive of “Vatan”. There are no obstacles to learn “Spiroskog” and “Vatana” as valid name-type tokens. However, in two cases in this example human intervention is needed: line 2 refers to Sonia Gandhi, the Indian politician, where “Sonije” is the correct Croatian genitive of the foreign personal name “Sonia”, but very close to the genitive of a frequent Croatian name “Sonja”, which is given as the first-ranked correction. It is obvious that the decision “to learn or not” is not trivial; frequency of occurrence may influence on it. We decided not to learn it, as “Sonje” is more frequent in Croatian. Line 4 refers to the frequent Dutch family name Van der Valk, where “Valka” is the Croatian genitive of “Valk”. However, “Valk” is very close to the frequent Croatian family name “Vlak”, whose genitive is suggested as the first-ranked correction. Again was the decision to learn or not to learn a difficult one. The last two examples explain why learning should be supervised by humans. Some decisions cannot rely on machine intelligence; they must be made based on human language competence. Dictionary updating is a matter of system precision. Users do not like false negatives, i.e. non-words not recognized as such by the spellchecker. False negatives are a much more serious problem than false positives, because a spellchecker with high recall, but low precision, does not serve its intended purpose. The problem can partially be resolved by introducing grammar checking and contextual spellchecking, but it is a privilege of few central languages only [18]. Nobody claims that Hascheck’s dictionary is 100% error-free. A special part of the Learning System takes care of errors that may occur there. If such an error is found, it is removed from the corresponding dictionary file. The system registers each Croatian common word-type that was deleted from the WT-file. Until now 3,724 such deletions were made, which amounts to 0.5% of the actual WT-file size. Each deletion in WT-file results in the n-graph statistics being updated. We estimate the NT-file deletions to be 1% of its actual size.
5 Comparison with Other Spellcheckers A while ago we tested 5 Croatian proprietary spellcheckers with a representative sample of errors collected through Hascheck. The false negative acceptance ranged between 2% and 6% [19]. With Croatian open-source spellcheckers, the situation is even worse. As is usual with non-central languages, all open-source engines, ispell, aspell and hunspell, use modest-size dictionary for Croatian spellchecking purposes. All examples quoted bellow can be proven by using multilingual online spellchecker Orangoo, powered by Google, aspell and TextTrust [20]. The open-source Croatian dictionary ignores: • capitalization in Croatian; it allows “amsterdama” to pass undetected, although the only correct spelling is “Amsterdama”, the Croatian genitive of “Amsterdam”.
• Croatian alphabet; it accepts “hydrostatičko = hydrostatic adj.”, although “y” is not a letter in the Croatian alphabet, and even worse, it rejects the correct adjective spelling “hidrostatičko”, offering “hydrostatičko” as a correction.
• phonetic character of the Croatian writing system; it allows “narezci”, presumably plural of “narezak = slice of ham, sausage etc.”, but ignoring that “z” before “c” must be converted into “s”, hence the correct word plural spelling is “naresci”.
• allowable Croatian endings; it allows “obukae”, although “–ae” does not exist as an ending in Croatian; the noun “obuka = training” is declined by changing the ending –a, and, in some phonetically motivated cases, by converting “k” into “c” before ending –i.
The Croatian open-source dictionary has 15% of such content.
6 Conclusion In order to build a spellchecker able to deal with characteristics common to very inflective languages, we have developed Hascheck - an intelligent spellchecking system. From its early days as an e-mail embedded service to the present status of a web service, Hascheck has constantly evolved in functionality, with the addition of the guessing and correction functionality to its basic error-reporting service, as well as the implementation of a human-supervised learning system. With more than 4,000 users per month and more that 36,000 texts received for processing, Hascheck has gained a wide and loyal user community. Hascheck incorporates peculiarity concept to classify unknown tokens it received for processing, points out tokens which might be inflected forms of legal tokens stored in its dictionary and it offers the most probable corrections for tokens which can’t be associated with any known token. The implemented concept of learning new tokens from the input its users submitted for processing resulted in the octupling of its initial Croatian common-word dictionary size and the creation of a dictionary of more than half a million name-types. The experience gained in the development of this spellchecker shows that the human supervision is extremely important, since only a human supervisor can decide whether the acceptance of a new (valid) word, which is also a probable misspelling of the existing word, turns out to be beneficial or detrimental to the usefulness of the dictionary. We strongly believe that our experience and achievements in the development of the spellchecker can be useful for any highly inflected language, not only for Croatian.
References 1. Dolgopov, A.S.: Automatic spelling correction. Cybernetics 22(3), 332–339 (1986) 2. Dembitz, S.: Word generator and its applicability. In: Proc. of the International Zurich Seminar on Digital Communications: Man-Machine Interaction, Zürich, Switzerland, pp. 59–64 (1982) 3. Morris, R., Cherry, L.L.: Computer detection of typographical errors. IEEE Transactions on Professional Communications PC-18(1), 54–64 (1975)
4. Bratanić, M.: English-Croatian Lexicographic Corpus. Bulletin of the Institute of Linguistics in Zagreb 1(1), 71–73 (1975) (in Croatian) 5. McIllroy, M.D.: Development of a spelling list. IEEE Trans. Commun. COM-30(1), 91–99 (1982) 6. Turba, T.N.: Checking for spelling and typographical errors in computer-based text. ACM SIGPLAN Notices 16(6), 51–60 (1981) 7. Damerau, F.J.: A technique for computer detection and correction of spelling errors. Communications of the ACM 7(3), 171–176 (1964) 8. Levenshtein, V.I.: Binary codes capable of correcting deletions, insertions and reversals. Soviet Physics Doklady 10, 707–710 (1966) 9. Dembitz, Š.: Distance between languages. In: Proc. SoftCOM 1996, Split, Croatia, pp. 219–296 (1996) (in Croatian) 10. Goldsmith, J.: Unsupervised Learning of the morphology of a natural language. Computational Linguistics 27(2), 153–198 (2001) 11. Navarro, G.: NR-grep: a fast and flexible pattern matching tool. Software Practice and Experience SPE-31, 1265–1312 (2001a) 12. Kukich, K.: Techniques for automatically correcting words in text. ACM Computing Surveys 24(4), 377–439 (1992) 13. Dembitz, Š., Knežević, P., Sokele, M.: Hascheck – The Croatian academic spelling checker. In: Milne, R., Macintosh, A., Bramer, M. (eds.) Applications and Innovations in Expert Systems VI, pp. 184–197. Springer, Heidelberg (1999) 14. Peterson, J.L.: A note on undetected typing errors. Communications of the ACM 29(7), 633–637 (1986) 15. Damerau, F.J., Mays, E.: An examination of undetected typing errors. Information Processing and Management 25(6), 659–664 (1989) 16. Zipf, G.K.: Human Behavior and the Principle of Least Effort. Addison-Wesley, Cambridge (1949) 17. CLC, Croatian Language Corpus, Institute of Croatian Language and Linguistics, Zagreb, Croatia (2010), http://riznica.ihjj.hr/ (March 14, 2010) 18. Dembitz, Š., Gledec, G., Randić, M.: Spellchecker, Wiley Encyclopedia of Computer Science and Engineering. In: Wah, B.W. (ed.), vol. 5, pp. 2793–2804. John Wiley & Sons, Hoboken (2009) 19. Dembitz, Š., Sokele, M.: Comparison of Croatian spelling checkers. In: Proc. SoftCOM 1997, Split-Dubrovnik (Croatia) – (Italy), pp. 191–200 (1997) (in Croatian) 20. Orangoo. Copyright 2007 TextTrust.com (2009), http://orangoo.com/spell/ (March 14, 2010)
Light-Weight Access Control Scheme for XML Data Dongchan An, Hakin Kim, and Seog Park Department of Computer Science & Engineering, Sogang University C.P.O. Box 1142, Seoul Korea 100-611 {channy,gkrlsl,spark}@sogang.ac.kr
Abstract. In this paper, an adaptable light-weight access control scheme is proposed for an evolutionary computing environment. The existing access control environment relies on the server to solve security problems. The current environment has become almost distributed and ubiquitous. This change has spawned the need for light-weight access control by clients, which have a resource-limited environment. Existing studies using role-based prime number labeling (RPNL) are limited by the scalability of the prime number. The problem of scalability of the prime number was addressed with the use of the strong point of RPNL and implementation of persistent XML labeling and efficient query processing. The proposal has the advantage of having an adaptable access control scheme for an existing XML labeling method. We showed the efficiency of the proposed approach through experiments. Keywords: Access Control, Labeling Scheme, XML, Resource-Limited Environment.
1 Introduction The access control environment has changed from a static to a dynamic one in recent years. The current computing environment is almost distributed and ubiquitous. This change has spawned the growing need for light-weight access control not by servers but by clients that have a resource-limited environment. Previous studies on access control have focused on safety rather than efficiency. However, the issues of maintaining security in a dynamic XML environment have attracted interest from researchers in the field. The currently preferred security mechanism is the use of an encryption method for client-side access control. However, this method has setbacks, including the problem of encryption key management and the limitations of fine-grained access control.

This paper investigates a safe and efficient method of handling dynamic XML data, which is a rising standard of data storage and transmission in devices. Examples of its applications are in personal digital assistants or portable terminals, which have limited resources. XML access control and previous XML filtering have very similar features, because XML access control rules can be represented in the XPath language, just as XPath queries are. This provides us with the possibility of efficient query processing of XML access control rules. This paper attempts to improve capability by first suggesting a light-weight access control processing method with a small overhead for secure
results. The study then proceeds to determine potential points of optimization in every process step to offset overheads from added access controls. The rest of this paper is structured as follows. Section 2 presents related works in the area of XML access control and labeling scheme. Section 3 introduces the algorithm and techniques of the proposed method. Section 4 shows the experimental evaluations from the implementation and the processing efficiency of the proposed framework. Finally, conclusions are made in Section 5.
2 Related Work The traditional XML access control enforcement mechanism [2, 3, 6, 7, 8, 10] is a view-based enforcement mechanism. The semantics of access control for a user is a particular view of the documents determined by the relevant access control policies. It provides a useful algorithm for computing the view using tree labeling. However, in addition to its high cost and maintenance requirements, this algorithm is also not scalable for a large number of users. To overcome view-based problems, Murata et al. [15] proposed a filtering method to eliminate queries that do not satisfy access control policies. Luo et al. [14] rewrote queries in combination with related access control policies before passing these revised queries to the underlying XML query system for processing. However, the shared Nondeterministic Finite Automata (NFA) of access control rules is created per user (or per user role). Thus, the shared NFA involves a number of access control rules that are unnecessary from the point of view of the user's query, requiring more time-consuming decisions about whether the user's query should be accepted, denied, or rewritten. In the hierarchical data model, an authorization defined by a security manager is called explicit, and an authorization defined by the system, on the basis of an explicit authorization, is called implicit. An implicit authorization is used with an appropriate "propagation policy" for the benefit of storage. Since the optimal propagation policy varies with the environment, "most-specific-precedence" is generally used. A "denial-takes-precedence" policy is used to resolve the "conflict" problem that can be derived from the propagation policy by such implicit authorization. An "open policy", which authorizes a node that does not have explicit authorization, and a "closed policy", which rejects access, are used because positive and negative authorizations are also used together. The "denial-takes-precedence" and "closed policy" policies are generally used to ensure tighter data security [15]. For the query processing of a dynamic XML document, a labeling technique that can easily accommodate the insertion and deletion of elements is required. Some existing labeling techniques lack document updating capability and instead search whole XML document trees again to re-calculate the overall label of each node, thereby resulting in higher costs [15]. The XML labeling technique has emerged as a consequence of the appearance of the dynamic XML document. A typical technique is the prime number labeling scheme, in which an update rarely affects other labels. This technique assigns a
label for each node, a prime number, to represent the ancestor-descendant relation. It is designed to leave the label of other nodes unaffected when updating the document. However, it has a higher updating cost because it searches a range of the XML trees again and re-records updated order information during the updating process [5, 11, 12, 17, 19].
3 Light-Weight Access Control The proposed environment for light-weight access control of dynamic XML data is shown in Fig. 1. When a query is requested, the authority and role of valid users are checked against the access control policy; the scenario depicts medical records that require accurate real-time query answers. The medical XML documents in Fig. 2 are explained under this specific environment.
Fig. 1. Light-weight access control environment
Fig. 2. Medical XML documents
3.1 Group-Based Prime Number Labeling Labeling Notation. The labels of all nodes were constructed using four significant components (l, L1, L2, and L3), which are unique.
1. Level component (l) – This represents the level of nodes in the XML document. The level of the tree is marked from root to leaf, but the level of the root is null.
2. Ancestor label component (L1) – The component that succeeds the label of the parent node is inherited. It eliminates the level component from the parent node label. By inheriting the label of the parent node, the exact location of the node can be identified.
3. Self-label component (L2) – This represents the self-label of the node in the XML document.
4. Group-based prime number product component (L3) – This represents the group-based prime number product of accessible roles for the node.
A unique label is created by the four components, which are concatenated by a "delimiter" (.).
Role-Based Prime Number Labeling (RPNL) Scheme. The labeling scheme [1] for an XML document is as follows: l L1.L2.L3
1. The root node is the first level. The root node is labeled "null" because it has neither a sibling node nor a parent node.
2. The first child of the root node is labeled "1a1.L3" (N1). L3 pertains to the prime number product of the accessible roles for the node. The second child of the root node is labeled "1b1.L3" (N2). l and L2 are concatenated because the label of the parent node (L1) is "null". The third component (L3) is optional at this level; if the node is accessible for all roles, the third component can be omitted.
3. The first child of the second level N1 is labeled "2a1.a1.L3" (NN1). The second child of the second level N2 is labeled "2b1.a1.L3" (NN2) because the label of the parent node (L1) is inherited. The third component (L3) is labeled with the prime number product of the accessible roles of the node.
4. The first child of the third level NN1 is labeled "3a1a1.a1.L3" (NNN1). The second child of the third level NN2 is labeled "3b1a1.a1.L3" (NNN2) because the labels of the parent node (L1, L2) are inherited.
5. The first child of the third level NNN1 is labeled "4a1a1a1.a1.L3" (NNNN1). The second child of the third level NNN2 is labeled "4b1a1a1.a1.L3" (NNNN2) because the labels of the parent node (L1, L2) are inherited.
6. The third component (L3) is the prime number product of the accessible roles for the node and is generated as follows:
- Date (210): product of 2, 3, 5, 7 ← the prime numbers of the accessible roles
- Doctor (30): product of 2, 3, 5
- Bill (14): product of 2, 7
- Patient (210): product of 2, 3, 5, 7
- Diagnosis (30): product of 2, 3, 5
- Dname (6): product of 2, 3
- Sex (210): product of 2, 3, 5, 7
- Pname (42): product of 2, 3, 7
Fig. 3 shows the labeled XML when the above RPNL labeling scheme is applied. The root node is labeled with an empty string. The self-label of the first child node is “a1”, the second child is “b1”, the third is “c1”, and the fourth is “d1”. In Fig. 3, the label of the node concatenates its parent label and the self label. If “2a1.a1.L3” is a prefix of “3a1a1.a1.L3,” “2a1.a1.L3” is an ancestor of “3a1a1.a1.L3”. In case the level difference is ‘1’, the upper level node is the parent node of the lower level.
Fig. 3. RPNL of medical XML documents
The RPNL [1] has many strong points, but the product of prime numbers used for access control has a weak point: the scalability of the prime number product. If there are four roles in the access control rule, the product is 210 (2x3x5x7). One byte can represent a value as large as 210. If there are 25 roles, however, the prime number product becomes extremely large (greater than a trillion) and requires many bytes. A solution to this scalability problem is discussed in detail in the next paragraph. Group-Based Prime Number Labeling Scheme. The proposed labeling scheme is similar to the RPNL scheme but without step 6. The third component (L3) is not a single prime number product; instead, the group-based prime number product of the accessible roles for the node is generated as follows:
- Group1 (surgery): R1, R2, R3, R4
- Group2 (pediatrics): R5, R6, R7, R8
- Role-assigned group-based prime numbers:
Group1: R1 (1.2), R2 (1.3), R3 (1.5), R4 (1.7)
Group2: R5 (2.2), R6 (2.3), R7 (2.5), R8 (2.7)
In Fig. 3, the groups covering the surgery and pediatrics nodes were defined because of the scalability of the prime number product. In this scenario, each group consists of four roles (e.g., R1, R2, R3, R4). Similar to RPNL, if patient (2) and doctor (3) are able to access the node, the node's label contains '6'. A group ID can be attached in front of '6', giving '1.6' if the group ID is '1'. More specifically, in group-based prime number labeling, the third component (L3) notation is
[Product of Group1 nodes.Product of Group2 nodes… Product of Groupn nodes]
- the maximum number of groups is n
- '0' means the node admits no role of that group
For example, for accessible roles R1, R2, R7, R8:
Group1 – product of R1 and R2: 2 x 3 = 6
Group2 – product of R7 and R8: 5 x 7 = 35
Therefore, the node label for access control is '6.35', and it requires 2 bytes. In the existing RPNL method, if eight roles (e.g., 2, 3, 5, 7, 11, 13, 17, 19) are able to access a node, the product is 9,699,690, which requires 3 bytes. In the group-based scheme, however, 1 byte is sufficient for each group (four roles per group), so only 2 bytes are needed for two groups. Thus, the group-based prime numbers consume less space than the existing method. Another advantage of group-based prime number labeling is that it can be attached to other labeling schemes; for example, the last component can be added to ORDPATH [16], GRP [13], and LSDX [9]. 3.2 Query Processing by Group-Based Prime Number Labeling The architecture of the proposed access control is shown in Fig. 4. The query processing procedure in Fig. 4 can be considered in two steps.
Fig. 4. Proposed access control architecture
The role check of the user is accomplished in step 1 through the "role-based prime number" of [1]. The access authority is then checked in step 2 using the "role privacy" of [1]. Once a query from a mobile terminal is registered, access authority is reviewed in step 1 by checking the role-based prime number of the query terminal node. That is, access is permitted in step 1 when the remainder of L3 (the group-based prime number product of the accessible roles) divided by the prime number of the user's role is zero. Finally, accessibility is checked in step 2 by referring to the "role privacy" of [1]. Moreover, as indicated in Section 2, query access is rejected by "denial-takes-precedence" [15].
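To make the step-1 check concrete, the following sketch illustrates how the group-based L3 component of a node could be built and how the divisibility test of step 1 could be performed. It is an illustration only: the function names, the grouping of four roles per group, and the role-to-prime mapping are assumptions based on the example above, not code from the paper.

```python
# Hypothetical sketch of group-based prime number labeling (L3) and the
# step-1 access check by divisibility. Four roles per group, mapped to 2, 3, 5, 7.

GROUP_PRIMES = [2, 3, 5, 7]          # primes reused inside every group

def role_prime(role_index):
    """Map a global role index (0-based) to (group_id, prime)."""
    group_id = role_index // len(GROUP_PRIMES) + 1
    prime = GROUP_PRIMES[role_index % len(GROUP_PRIMES)]
    return group_id, prime

def build_l3(accessible_roles, n_groups):
    """Return the L3 component, e.g. '6.35' for roles R1, R2, R7, R8."""
    products = [0] * n_groups         # '0' means no role of that group is admitted
    for r in accessible_roles:
        gid, p = role_prime(r)
        products[gid - 1] = (products[gid - 1] or 1) * p
    return ".".join(str(x) for x in products)

def step1_access_allowed(l3_label, user_role):
    """Step-1 check: the user's role prime must divide its group's product."""
    gid, p = role_prime(user_role)
    group_product = int(l3_label.split(".")[gid - 1])
    return group_product != 0 and group_product % p == 0

if __name__ == "__main__":
    label = build_l3([0, 1, 6, 7], n_groups=2)   # roles R1, R2, R7, R8
    print(label)                                 # -> 6.35
    print(step1_access_allowed(label, 0))        # R1 -> True
    print(step1_access_allowed(label, 2))        # R3 -> False
```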
4 Evaluation 4.1 Label Size and Time In this section, the applicability of the approach was compared in various ways. For the first set of experiments, one PC with an Intel Pentium IV 2.66 GHz CPU and 1 GB of main memory, running MS Windows XP Professional, was used. The proposed group-based prime number labeling scheme was implemented in Java (JDK 1.5.0), and SAX from Sun Microsystems was used as the XML parser. For the database, XMark datasets were used to generate XML documents [18]. The performances of GRP [13], LSDX [9], and ORDPATH [16], which support dynamic XML data, were analyzed. XMark was utilized to generate the first 10 XML documents, consistent with the number used by [13]. Labels for the XML documents were generated using the proposed dynamic labeling scheme and compared with the other schemes in terms of total label length. As shown in Fig. 5, the label size in the proposed scheme is about three times shorter on average than the GRP scheme, two times shorter than the LSDX scheme, and almost equal to ORDPATH.
Fig. 5. Label size for query processing (label size in MB versus document size in MB, for GRP, LSDX, ORDPATH and the proposed labeling scheme)
Fig. 6. Label size for access control
Fig. 7. Label time (labeling time in seconds versus document size in MB, for GRP, LSDX, ORDPATH and the proposed labeling scheme)
The scalability of the group-based prime number is superior: it is up to 75% better than the bit-list method and up to 60% better than the prime number method (Fig. 6). The proposed scheme also offers a choice of unit for the group-based prime number product, allowing an adaptable label size of 4 bits, 1 byte, or 2 bytes according to the number of roles or groups. Fig. 7 indicates that the proposed labeling time is superior to the others. 4.2 Memory Usage and Query Processing Time The same hardware was used in the second set of experiments. For the data, SigmodRecord.xml, summarized in Table 1, was used.
Table 1. SigmodRecord.xml
Size: 467 KB
# distinct tags: 11
Depth: 5
# text nodes: 8,383
# elements: 10,022
Memory usage and the overhead of the access control were analyzed. Memory usage was measured during preprocessing. Figs. 8 and 9 show the results of each method. The proposed method uses the same amount of memory regardless of the depth of the access control rules, whereas the method of [4] uses more memory as the depth of the access control rules increases. The experiments showed the proposed method to be superior to that of [4] because of its faster access decision.
Fig. 8. Memory usage of proposed method (memory usage in bytes versus depth of access control rules, for DTD, ACR and query registration, with and without predicates)
Fig. 9. Memory usage of [4] (memory usage in bytes versus depth of access control rules)
In the third experiment, the average processing time was measured (Fig. 10). The average query processing time was compared in two cases: one applying the access control method proposed in this paper and one not applying it. The processing time is reported as an average to minimize measurement error. Referring to the access control information does not affect the system performance.
Fig. 10. Average processing time (pre-processing and query processing time in ms, with and without access control rules applied)
5 Conclusions A light-weight access control scheme for dynamic XML data was presented in this paper. The limitations of existing access control and labeling schemes for XML data, assuming documents in a ubiquitous computing environment, were also highlighted. An evolving computing environment requires an adaptable access control scheme. The existing access control environment focuses on the server to solve security problems. However, there is a growing need in the current computing environment for light-weight access control performed not by the server but by the client, which has limited resources, because the recent environment is largely distributed and ubiquitous. Previous studies using RPNL are limited by the scalability of the prime number. Using the strong points of RPNL, persistent XML labeling and efficient query processing were implemented. The proposed method has the advantage of being an adaptable access control scheme for existing XML labeling methods. In this paper, a group-based prime number labeling method that improves on role-based prime number labeling was proposed; in particular, it addresses the scalability of the prime number product. A light-weight access control for dynamic XML data using group-based prime number labeling and a fast query processing engine was also proposed. Experimental evaluations clearly demonstrated that the proposed approach is efficient. Efficient and secure query processing in ubiquitous sensor networks will be the focus of our future studies.
References 1. An, D.C., Park, S.: Access Control Labeling Scheme for Efficient Secure XML Query Processing. In: Lovrek, I., Howlett, R.J., Jain, L.C. (eds.) KES 2008, Part II. LNCS (LNAI), vol. 5178, pp. 346–353. Springer, Heidelberg (2008) 2. Bertino, E., Castano, S., Ferrari, E., Mesiti, M.: Specifying and Enforcing Access Control Policies for XML Document Sources. WWW Journal (2000)
3. Bertino, E., Ferrari, E.: Secure and Selective Dissemination of XML Documents. TISSEC 5(3) (2002) 4. Bouganim, L., Ngoc, F.D., Pucheral, P.: Client-Based Access Control Management for XML Documents. In: Proceedings of the 30th International Conference on Very Large Data Bases, pp. 84–95 (2004) 5. Cai, J., Poon, C.K.: OrdPathX: Supporting Two Dimensions of Node Insertion in XML Data. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds.) Database and Expert Systems Applications. LNCS, vol. 5690, pp. 332–339. Springer, Heidelberg (2009) 6. Damiani, E., Vimercati, S., et al.: Securing XML Document. In: Zaniolo, C., Grust, T., Scholl, M.H., Lockemann, P.C. (eds.) EDBT 2000. LNCS, vol. 1777, p. 121. Springer, Heidelberg (2000) 7. Damiani, E., Vimercati, S., et al.: A Fine-grained Access Control System for XML Documents. ACM Trans. Information and System Security 5(2) (2002) 8. Damiani, E., Fansi, M., Gabillon, A., Marrara, S.: A general approach to securely querying XML. Computer Standards & Interfaces 30(6), 379–389 (2008) 9. Duong, M., Zhang, Y.: LSDX: A new Labeling Scheme for Dynamically Updating XML Data. In: Australasian Database Conference (2005) 10. Fan, W., Fundulaki, I., Geerts, F., Jia, X., Kementsietsidis, A.: A View Based Security Framework for XML. In: AHM (2006) 11. Kim, J.Y., Park, S.M., Park, S.: A New Labeling Scheme without Re-labeling Using Circular Concepts for Dynamic XML Data. In: Proceeding of CIT (1), pp. 318–323 (2009) 12. Liang, W., Takahashi, A., Yokota, H.: A Low-Storage-Consumption XML Labeling Method for Efficient Structural Information Extraction. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds.) Database and Expert Systems Applications. LNCS, vol. 5690, pp. 7–22. Springer, Heidelberg (2009) 13. Lu, J., Wang Ling, T.: Labeling and Querying Dynamic XML Trees. In: Yu, J.X., Lin, X., Lu, H., Zhang, Y. (eds.) APWeb 2004. LNCS, vol. 3007, pp. 180–189. Springer, Heidelberg (2004) 14. Luo, B., Lee, D.W., Lee, W.C., Liu, P.: Qfilter: Fine-grained Run-Time XML Access Control via NFA-based Query Rewriting. In: CIKM 2004, pp. 543–552 (2004) 15. Murata, M., Tozawa, A., Kudo, M.: XML Access Control Using Static Analysis. In: ACM CCS, New York, pp. 73–84 (2003) 16. O’Neil, P., O’Neil, e., et al.: ORDPATHs: Insert-Friendly XML Node Labels. In: Proceeding of the SIGMOD, pp. 903–908 (2004) 17. Sans, V., Laurent, D.: Prefix Based Numbering Schemes for XML: Techniques, Applications and Performances. In: Proceedings of the 34th International Conference on Very Large Data Bases, pp. 1564–1573 (2008) 18. Schmidt, A., Waas, F., Kersten, M., Carey, M.J., Manolescu, I., Busse, R.: XMark: a benchmark for XML data management. In: VLDB (2002) 19. Wu, X., Li, M., Hsu, L.: A Prime Number Labeling Scheme for Dynamic Ordered XML Trees. In: ICDE (2004)
A New Distributed Particle Swarm Optimization Algorithm for Constraint Reasoning Sadok Bouamama SOIE Laboratory, University of Tunis, Higher school of technology and informatics of Carthage [email protected]
Abstract. Within the framework of constraint reasoning, we introduce a new distributed particle swarm approach. It is a multi-agent approach that addresses additive constraint satisfaction problems (∑CSPs), inspired by the dynamic distributed double guided genetic algorithm (D3G2A) for constraint reasoning. It consists of agents that are dynamically created and that cooperate in order to solve problems. Each agent locally performs its own particle swarm optimization (PSO) algorithm, which differs slightly from other PSO algorithms. Moreover, the new approach parameters not only allow diversification but also permit escaping from local optima. Experiments are reported to show the effectiveness of our approach. Keywords: Constraint satisfaction problems, particle swarm optimization, guidance, NEO-DARWINISM theory.
1 Introduction One of the most attractive fields of artificial intelligence is constraint satisfaction problems (CSPs). Real-world problems such as resource allocation, frequency allocation, and vehicle routing are easily formulated as CSPs. Unfortunately, these problems are NP-hard, which is why devising new solvers and methods for CSP resolution is a foremost research challenge. A CSP is a triple (X, D, C) whose components are defined as follows:
• X is a finite set of variables {x1, x2, ..., xn}.
• D is a function which maps each variable in X to its domain of possible values, of any type; Dxi denotes the set of objects mapped from xi by D, so D can be considered as D = {Dx1, Dx2, …, Dxn}.
• C is a finite, possibly empty, set of constraints on arbitrary subsets of variables in X. These constraints are represented in extension or in intention.
A CSP solution is an instantiation of all variables with values from their respective domains, and the instantiation must satisfy all constraints. Such a solution is costly to obtain. In fact, for some real-life problems, the solution is either too expensive to get or even nonexistent. In addition, in some cases we need to express preferences and the evolution of constraints in time. In such a case, one should assign a weight to each constraint and try to maximize the sum of satisfied constraint
weights. These problems are referred to as additive CSPs (∑CSPs) [12]. They can be defined as a quadruple (X, D, C, f), where (X, D, C) is a CSP and f is an objective function which maps every instantiation to a numerical value representing the sum of violated constraint weights. ∑CSPs make up the framework of this paper. ∑CSPs [12, 15] have been dealt with by complete or incomplete methods. Complete methods are able to provide an optimal solution; unfortunately, combinatorial explosion thwarts this advantage. Incomplete methods, such as genetic algorithms (GAs) [15] and particle swarm optimization (PSO) [10], avoid the trap of local optima but sacrifice completeness for efficiency. The present work arises from one of the incomplete but distributed GA-based methods, the Dynamic Distributed Double Guided Genetic Algorithm (D3G2A), which has been successfully applied to many extensions of CSPs [2, 3, 4]. This paper's focal point is PSO. Our interest is motivated by the fact that PSO has proven useful for hard optimization problems [13, 14]. This paper presents a new distributed PSO-based approach and is organised as follows. The first section briefly recalls PSO. The second presents the Dynamic Distributed Double Guided Particle Swarm Optimization: the basic concepts, the agent structure and the global dynamics. The third section details both the experimental design and the results. Finally, concluding remarks and possible extensions to this work are proposed.
2 Particle Swarm Optimization and Particle Swarm Optimization for CSP Particle swarm optimization (PSO) was developed by Kennedy and Eberhart [10]. It is an evolutionary computation (EC) technique in which each potential solution is called a "particle". The generic PSO algorithm [14] is initialized with a random population of solutions and then searches for the optimum over many generations, each generation's update being based on the previous ones. In addition to common EC principles, PSO borrows from artificial swarm intelligence: it assumes that population evolution can be helped by socially shared information [5, 10, 14, 15]. Consequently, particle movements are managed by reference to both the local particle intelligence and the global swarm intelligence. In [11], Lin formulated this evolution as shown in equation 1.

$\vec{V}_i(t) = \alpha \,(\text{individual intelligence}) + \beta \,(\text{global intelligence})$  (1)
where $\alpha$ and $\beta$ are two random parameters and $\vec{V}_i(t)$ is the particle's velocity, an n-dimensional vector that moves particle $P_i$ at time t. In PSO, the particles are floated through the search space by following the currently known optimal particles [10]. The particle updates are accomplished according to equations 2 and 3. Note that velocities are mainly affected by the particle's own knowledge and the neighbors' experience [14].
$\vec{V}_i(t) = w \times \vec{V}_i(t) + \alpha \,(P_{lbest} - x_{id}) + \beta \,(P_{gbest} - x_{id})$  (2)
$x_{id} = x_{id} + \vec{V}_{id}$  (3)
Each particle's new velocity is computed by reference to equation 2. The calculation considers the previous velocity $\vec{V}_i(t)$, the particle's location at which the best fitness (objective function value) has been reached so far (Plbest), and the location in the population's global neighborhood at which the best fitness has been reached so far (Pgbest). The particle's position in the search space, denoted by xid, is updated using the particle's last position and its velocity (equation 3). The inertia weight w determines the influence of the previous velocity; it speeds up convergence by alternately controlling the particles' capacity to explore and to exploit the search space. PSO has proved worthwhile in many fields of optimization [14]. Unfortunately, like other EC techniques, it has several weaknesses reported in the literature. First, the PSO parameters are specific to the problem to be solved and have to be re-tuned for each problem category [14]; this problem dependency motivates the development of further PSO-based algorithms. On the other hand, PSO may be trapped in local optima or become slower as it approaches an optimal solution [10]. In this case one could resort to good parameter management, which can in turn be thwarted by combinatorial explosion. Therefore, additional parameters for the PSO algorithm are worth considering.
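For illustration, a minimal sketch of the velocity and position updates of equations (2) and (3) is given below; it is not the authors' implementation, and the parameter values and array shapes are illustrative assumptions.

```python
import numpy as np

# Minimal PSO update step for one particle, following equations (2) and (3).
# w, the dimension and the random seed are illustrative values, not from the paper.
rng = np.random.default_rng(0)
n_dim = 5
w = 0.7                              # inertia weight

x = rng.uniform(0, 1, n_dim)         # current position x_id
v = np.zeros(n_dim)                  # current velocity V_i(t)
p_lbest = rng.uniform(0, 1, n_dim)   # best position found by this particle
p_gbest = rng.uniform(0, 1, n_dim)   # best position found by the neighborhood

alpha, beta = rng.uniform(0, 1, 2)   # random coefficients of eq. (1)/(2)

v = w * v + alpha * (p_lbest - x) + beta * (p_gbest - x)   # eq. (2)
x = x + v                                                  # eq. (3)
print(x)
```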
3 D3GPSO: The Dynamic Distributed Double Guided Particle Swarm Optimization Algorithm Inspired by the works in [2,3,4,9], our work consists in developing a distributed PSO approach for constraint reasoning. In [9], the CPU time is largely spent initializing all particles as feasible solutions, since the method is based on preserving the feasibility of solutions. The first problem is that each particle searches the whole space but only keeps track of feasible solutions; this is meant to find the optimum in the feasible space. However, if an instantiation does not exist in the initial search space, there is no way to obtain it through the optimization process. Thus, it is sometimes necessary to have a random sub-process that generates the possibly missing potential solutions. The first known improvement mechanism in this case is the diversification of the search process in order to escape from local optima [12]. No doubt, the simplest mechanism to diversify the search is to introduce a noise component during the process. For all these reasons, and by reference to [2,3,4,12], we propose to enhance the movement of particles with a random-providing operator, called the guidance probability Pguid. The search process executes a random movement with probability Pguid and is guided and intensified with probability 1 - Pguid [2,3,4]. The random movement of a particle is done according to equation (4).
$x_{id} = \text{random} \times x_{id}$  (4)
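A sketch of this choice between random and guided movement is shown below, assuming the updates of equations (2) to (4); the helper name and the uniform random draws are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def move_particle(x, v, p_lbest, p_gbest, w, p_guid):
    """One D3GPSO-style move: random with probability Pguid, guided otherwise."""
    if rng.random() < p_guid:
        # Random (diversifying) movement, equation (4).
        return rng.random() * x, v
    # Guided (intensifying) movement, equations (2) and (3).
    alpha, beta = rng.random(2)
    v = w * v + alpha * (p_lbest - x) + beta * (p_gbest - x)
    return x + v, v
```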
3.1 Basic Principles Unsurprisingly, bird swarms may spread over the same country in search of sustenance. As soon as one of the swarms finds a source of nourishment, all the swarms may pursue
it. Furthermore, in nature there are many bird species. Each species consists of several individuals having common characteristics, and each species performs an ecological task that represents its ecological niche. An interesting number of methods have been devised in order to favor the building of both species and ecological niches [2,3,4] in genetic algorithms. Goldberg states that sexual differentiation based on specialization, via both the building of species and the exploitation of ecological niches, provides good results [6]. Our approach draws on these concepts. Our main idea rests on an analogy with natural swarming theory and adheres to both the species and ecological niche theories. The main inspiration consists in partitioning the search space into sub-spaces. Each agent, called a Species agent, is responsible for one sub-space. A given sub-space consists of particles having their fitness values in the same range. This range, called FVR (fitness value range), is the specificity of the Species agent SpeciesFVR. Species agents interact in order to reach an optimal solution to the problem. For this purpose, each Species agent performs its own PSO algorithm on its sub-space, double guided by both the template concept [15] and the min-conflict heuristic [2,3,4]. A mediator agent is required between the society of Species agents and the user, essentially to detect the best partial solution reached during the dialogue between the Species. This agent, called Interface, may also create new Species agents. 3.2 Agent Structure Each agent has a simple structure: its acquaintances (the agents it knows and with which it can communicate), a local knowledge base composed of its static and dynamic knowledge, and a mailbox where it stores received messages to be processed later, one by one. 3.2.1 Species Agent A Species agent has as acquaintances the other Species agents and the Interface agent. Its static knowledge consists of the ∑CSP or CSOP data (i.e., the variables, their domains of values and the constraints), its specificity (i.e., the fitness function range) and its local PSO parameters. Its dynamic knowledge consists of the swarm, which varies from one generation to another. 3.2.2 Interface Agent An Interface agent has as acquaintances all the Species agents. Its static knowledge consists of the ∑CSP or CSOP data. Its dynamic knowledge includes the best solution. 3.3 Objective Function We define an augmented objective function (AOF) g which will be used by the optimization process [2,3,4].
$g(ps) = f(ps) + \lambda \sum_i \big( PC_i \times I_i(ps) \big)$  (5)
The augmented objective function g is thus the sum of the objective function of a particle and the penalties of the features it contains.
where ps is a potential solution and λ is a parameter of the algorithm called the regularization parameter; it determines the proportion that penalties contribute to the fitness function value. PCi represents the penalty counter for such a solution, and Ii(ps) is an indicator for the solution: it is equal to 1 if ps satisfies all the constraints and equal to 0 otherwise. In the case of a ∑CSP (X, D, C, f), the objective function for a potential solution ps can be defined as

$f(ps) = \sum W_{C_i}$  (6)
where Ci is a violated constraint and WCi is its cost. 3.3.1 Global Dynamic The Interface agent randomly generates the initial swarm and then partitions it into sub-swarms according to their specificities, i.e., the fitness value ranges (FVR). A sub-swarm is thus composed of the particles having their FV in the same range FVR. Note that in the approach of Hu and Eberhart [9], all the particles are repeatedly initialized until they satisfy all the constraints; hence, our approach has a less time-consuming initialization process. After that, the Interface agent creates Species agents to which it assigns the corresponding sub-swarms, and then asks these Species to perform their optimization processes. Before starting its own optimization process, i.e., its own behavior, each Species agent SpeciesFVR initializes all templates and penalty counters corresponding to its particles. After that, it carries out its PSO process on its initial sub-swarm, i.e., the search sub-space that the Interface agent associated with it at the beginning. For each particle of the returned sub-swarm, SpeciesFVR computes its fitness function value (FV). Two cases may then occur. The first corresponds to a particle having its FV in the same range as the other particles in the sub-swarm. In the second case, FV is not in the range FVR (the specificity of the corresponding SpeciesFVR), and the particle is sent to the Species agent in charge of the range of FV if such an agent already exists; otherwise it is sent to the Interface agent, which creates a new agent having the range of FV as its specificity and transmits the particle to it. Whenever a new Species agent is created, the Interface agent informs all the other agents about this creation and then asks the new Species to perform its optimization process. It is important to note that message processing is given priority: whenever an agent receives a message, it stops its behavior, saves the context, updates its local knowledge, and restores the context before resuming its behavior, which describes its PSO sub-process. Each Species agent applies its behavior to its sub-swarm until the stopping criterion is met; this could be reaching a maximum number of iterations or a minimum error criterion, and we adopt the first in our case. 3.3.2 Template Concept and Min-conflict Heuristic The template concept was introduced by Tsang [15], and templates have proven effectiveness [15]. In our approach, each particle is attached to a template made up of weights referred to as templatei,j, each of which corresponds to a variablei,j, where i refers to the particle and j to the position. The new particle position is influenced by the best fitness reached. In our case, the parameters α and β are calculated using equations 7 and 8.
$\alpha = \text{template}_{lbest} \,/\, (\text{template}_{lbest} + \text{template}_{gbest})$  (7)
$\beta = \text{template}_{gbest} \,/\, (\text{template}_{lbest} + \text{template}_{gbest})$  (8)
From now on, the two parameters α and β are functions of the templates. Their worth is that they express the probability for a particle i to propagate its knowledge, which reflects the fact that the best particles have a greater chance of being followed by the others; indeed, that is more natural. For each of its particles, SpeciesFVR uses the min-conflict heuristic: it first determines the variable involved in the worst FV, then selects from this variable's domain the value giving the lightest template value, and finally instantiates the variable with this value. When all the Species agents reach the stopping criterion, they successively transmit the best particle of their sub-swarms, together with its FV, to the Interface agent. The latter determines and displays the best particle, namely the solution giving the best FV. 3.3.3 Local Optima Detector and Penalty Operator The LOD (local optima detector) is a number of iterations in which the neighborhood gives no improvement: if the FV of the best particle remains unchanged for this number of generations, we conclude that the Species optimization sub-process is trapped in a local optimum. It is a parameter of proven effectiveness when added to genetic algorithms for constraint reasoning [2,3,4]. To enhance the approach, LOD is automatically set to one if the unchanged FV is smaller than the last stationary FV; otherwise the LOD remains unchanged. In other words, LOD is a parameter of the whole optimization process that is locally and dynamically updated by each agent. Note that every SpeciesFVR has to save its best particle fitness value FV for the next generations; this is useful not only in the case of stationary fitness values but also for selecting the best particle in the sub-swarm. If the best FV remains unchanged for LOD generations, the process is considered trapped in a local optimum, and for all the particles having this FV, the related template is incremented by one. Consequently, penalties are used to update the particle templates. 3.4 Dynamic Approach It is worth mentioning that our approach is based on both the Neo-Darwinism theory rooted in Charles Darwin's work of 1859 and the law of nature, « The Preservation of Favoured Races in the Struggle for Life » [7]. In swarming theory, this phenomenon can be described by a group of ducks which change their position to follow the position of their best member; this movement can be very beneficial, since they suppose that the new source of nutrition may be better. In nature, such a phenomenon can also be seen in bird migration (even if the current location is commendable). So, from now on, LOD is a function of the fitness and of a new operator called ε. This operator is a weight taking values from 0 to 1.
In the optimization process, each Species agent proceeds with its own PSO process. Indeed, before starting the optimization process, the agents have to compute their own LOD on the basis of their fitness values.
4 Experiments 4.1 Introduction The set of problems used is academic: it consists of randomly generated ∑CSPs [4]. Our approach is compared to the dynamic distributed double guided genetic algorithm (D3G2A) [2,3,4], which gives good results compared to other approaches. All implementations are done with ACTALK [4], a concurrent object language implemented on top of the object-oriented language SMALLTALK-80. This choice of Actalk is justified by its convivial aspect, its reflexive character, the possibility of carrying out functional as well as object-oriented programming, and the simulation of parallelism. 4.2 Experimental Design: Randomly Generated ∑CSPs Randomly generated binary ∑CSP samples [4] are used to evaluate our approach (a generation sketch is given after the comparison criteria below). The generation is guided by the classical CSP parameters: number of variables (n), domain size (d), constraint density p (a number between 0 and 100% indicating the ratio of the number of effective constraints of the problem to the number of all possible constraints, i.e., a complete constraint graph) and constraint tightness q (a number between 0 and 100% indicating the ratio of the number of pairs of values forbidden (not allowed) by the constraint to the size of the domain cross product). As numerical values, we use n = 20 and d = 20. Having chosen the values 0.1, 0.3, 0.5, 0.7 and 0.9 for the parameters p and q, we obtain 25 density-tightness combinations. For each combination, we randomly generate 30 examples; therefore we have 750 examples. Random costs are assigned to each constraint. Moreover, we perform 10 runs per example and take the average without considering outliers. For each density-tightness combination, we also take the average over the 30 generated examples. Regarding the PSO parameters, all the experiments use a number of iterations (NI) equal to 500, an initial population size of 20, and an inertia weight of [0.5 + (Rnd/2.0)]. For the AOF, λ is equal to 10. LOD has an initial value of 3, and ε is equal to 0.6. These values were chosen after many experiments and seem to give the best compromise [2,3,4]. 4.2.1 Experimental Results The performances of the approaches are compared as follows:
• For the same value of the objective function, we compute the time elapsed by each approach to attain this value. Note that the objective function takes values between 1 and 55; consequently, we choose fixed values.
• For the same CPU time, we compute the fitness value attained by the two approaches.
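The following sketch shows one way such binary ∑CSP instances could be generated from n, d, p and q with random constraint weights; the exact generator used in [4] is not specified here, so the weight range and data layout are assumptions.

```python
import itertools
import random

def generate_sum_csp(n=20, d=20, p=0.5, q=0.5, max_weight=10, seed=0):
    """Random binary Sum-CSP: variables 0..n-1, domains 0..d-1.

    Returns a list of constraints (i, j, forbidden_pairs, weight)."""
    rng = random.Random(seed)
    constraints = []
    for i, j in itertools.combinations(range(n), 2):
        if rng.random() >= p:                  # density: keep a pair with probability p
            continue
        pairs = list(itertools.product(range(d), repeat=2))
        n_forbidden = int(q * len(pairs))      # tightness: share of forbidden pairs
        forbidden = set(rng.sample(pairs, n_forbidden))
        weight = rng.randint(1, max_weight)    # random constraint cost
        constraints.append((i, j, forbidden, weight))
    return constraints

def objective(assignment, constraints):
    """f(ps): sum of the weights of the violated constraints."""
    return sum(w for i, j, forbidden, w in constraints
               if (assignment[i], assignment[j]) in forbidden)
```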
From the CPU time point of view (given in seconds in Figure 1), D3GPSO requires less than or equal time for most of the example set. Note that in the works of [2,3,4], D3G2A is shown to give results competitive with other approaches known in the literature. This confirms the role of the random repositioning of particles, which diversifies the search, while intensification of the search process is provided by guidance. The role of agent interaction in speeding up the optimization process is also to be considered a betterment factor.
Fig. 1. Given CPU times for fixed value of objective function
Regarding the solution quality, D3GPSO always finds the same or better solution than D3G2A (Figure 2). The most important results are perceived for the most strongly constrained and tightest set of problems. This is the result of the diversification and the intensification used in our approach.
Fig. 2. Fitness values given for fixed CPU times
Note that, compared to the centralized approaches, our approach may scale up well especially in the case of relatively difficult problems. This fact, is explained by the position of the multiagent system in the optimization process. Note that in [2,3,4], using the multiagent systems is shown to be able to improve the result of the
optimization process. Compared to D3G2A, only a little improvement was observed. This can be explained by the fact that in both approaches the agents' interaction makes the optimization process faster. Moreover, thanks to the diversification and intensification used in the search, a good solution quality is guaranteed.
5 Conclusion and Perspectives We have developed a new PSO-based approach, called dynamic distributed double guided particle swarm optimization and denoted by D3GPSO. This approach is a dynamic distributed double guided PSO enhanced by three new parameters: the guidance probability Pguid, the local optima detector LOD, and the weight ε used by the Species agents to determine locally their own PSO process parameters on the basis of their specificities. Applied to randomly generated problems and compared to D3G2A, our new approach is experimentally shown to be better in terms of solution quality and CPU time. We attribute this to both diversification and guided intensification: the first increases the algorithm's convergence by escaping from the attraction basins of local optima, while the latter helps the algorithm attain optima. Consequently, D3GPSO gives the optimization process more chance to visit the whole search space. The particle movement is sometimes random, aiming at diversifying the search, and sometimes guided, in order to improve the objective function value. By reference to Neo-Darwinism, the optimization process of the Species agents is no longer uniform; it depends on their specificities (the objective function ranges, FVR). This operation is based on the species typology: the sub-swarm of a Species agent can be considered "strong" or "weak" with reference to its FVR. For a strong species, it is better to increase LOD, whereas for a weak species LOD is decreased. These measures not only diversify the search but also explore its space more fully. Further work could focus on extending D3GPSO to solve continuous hard problems, both from the literature and from real life. No doubt further refinement of this approach would allow its performance to be improved.
References 1. Bellicha, A., Capelle, C., Habib, M., Kökény, T., Vilarem, M.: Exploitation de la relation de substitualité pour la résolution des CSPs. Revue d’intelligence artificielle RIA, vol. 3 (1997) 2. Bouammama, S., Ghedira, K.: A Family of Distributed Double Guided Genetic Algorithm for Max_CSPs. The International Journal of Knowledge-based and Intelligent Engineering Systems 10(5), 363–376 (2006) 3. Bouammama, S., Ghedira, K.: Une nouvelle génération d’algorithmes génétiques guidés distribués pour la résolution des Max_CSPs. Revue des sciences et technologie de l’information, serie Technique et science informatique, TSI 27(1-2), 109–140 (2008) 4. Bouammama, S., Ghedira, K.: A Dynamic Distributed Double Guided Genetic Algorithm for Optimization and Constraint Reasoning. International Journal of Computational Intelligence Research 2(2), 181–190 (2006)
5. Chu, M., Allstot, D.J.: An elitist distributed particle swarm algorithm for RF IC optimization. In: The ACM Conference on Asia South Pacific Design Automation, Shanghai, China, SESSION: RF Circuit Design and Design Methodology (2005) 6. Craenen, B., Eiben, A.E., van Hemert, J.I.: Comparing Evolutionary Algorithms on Binary Constraint Satisfaction Problems. IEEE Transactions on Evolutionary Computation 7(5), 424–444 (2003) 7. Darwin, C.: The Origin of Species, Sixth London edn. (1859/1999) 8. Hereford, J.M.: A Distributed Particle Swarm Optimization Algorithm for Swarm Robotic Applications. In: IEEE Congress on Evolutionary Computation (2006) 9. Hu, X., Eberhart, R.: Solving Constrained Nonlinear Optimization Problems with Particle Swarm Optimization. In: 6th World Multiconference on Systemics, Cybernetics and Informatics, Orlando, Florida, USA (2002) 10. Kennedy, J., Eberhart, R.C.: Particle swarm optimization. In: Proceedings of IEEE International Conference on Neural Networks, pp. 1942–1948. IEEE Service Center, Piscataway (1995) 11. Lin, C., Hsieh, S., Sun, Y., Liu, C.: PSO-based learning rate adjustment for blind source separation. In: IEEE International Symposium on Intelligent Signal Processing and Communication Systems (2005) 12. Schiex, T., Fargier, H., Verfaillie, G.: Valued constrained satisfaction problems: hard and easy problems. In: 14th IJCAI, Montreal, Canada (1995) 13. Schoofs, L., Naudts, B.: Swarm intelligence on the binary constraint satisfaction problem. In: IEEE Congress on Evolutionary Computation (2002) 14. Schutte, J.F., Reinbolt, J.A., Fregly, B.J., Haftka, R.T., George, A.D.: Parallel global optimization with particle swarm algorithm. International Journal for Numerical Methods in Engineering 61, 2296–2315 (2004) 15. Tsang, E.P.K., Wang, C.J., Davenport, A., Voudouris, C., Lau, T.L.: A family of stochastic methods for Constraint Satisfaction and Optimization. Technical report, University of Essex, Colchester, UK (November 1999)
Simulation of Fuzzy Control Applied to a Railway Pantograph-Catenary System Simon Walters University of Brighton, Brighton, U.K. [email protected]
Abstract. Throughout Europe, environmental concerns are increasingly resulting in accelerated development of high-speed rail networks. The Pantograph-Catenary (PAC) power collection system has been retained for compatibility reasons, but the normally-reliable passive system can suffer increasing failure of contact at elevated speeds due to wave motion of the catenary. The results include loss of power and premature component failure due to formation of electric arcs. The paper discusses simulation of a fuzzy logic based active PAC control system using the MATLAB-Simulink suite of software. A flexible control system model is developed and initial test simulations are described. Future refinements are proposed for improved system modelling and control. This work is funded under an Interreg IV project, “PAntograph Catenary Interaction Framework for Intelligent Control (PACIFIC)”.
1 Introduction The on-going rise of European high-speed rail travel is expected to continue for the foreseeable future, mostly due to strong environmental pressures. There is already considerable reliance on such rail transport for freight as well as passenger transport; for example, the Channel Tunnel Rail Link. There is concurrently a strong drive towards the adoption of faster railway operating speeds to reduce travel times. Train breakdowns are a fact of life, affecting not just the train concerned but also potentially all of the following traffic. Therefore, these occurrences must be minimised, through proficient system design coupled with effective planned and emergency maintenance. The Pantograph-Catenary (PAC) system is used throughout Europe for supplying the power to railway trains. Whilst offering a balance of cost-effectiveness versus reliability, this important link in the railway power supply chain is also a listed cause of train failure faults. At increasing train speeds, the reliability of power collection with conventional PAC systems can decrease substantially, with a concomitant increase in component wear and accelerated failure. The Interreg IV funded “PAntograph Catenary Interaction Framework for Intelligent Control (PACIFIC)” project focuses on techniques for improvement of PAC systems. The MATLAB/Simulink modelling and simulation techniques described in this paper form a basis for the project consortium to evaluate selected control systems.
Initially, an established industry-standard PAC system is described, leading to a simplified mechanical model. The model is then used to provide details of plant dynamics for a selected control system design, which, in turn, uses a fuzzy inference system (FIS) as a controller. Fuzzy logic was chosen for its relative ease of validation coupled with its capacity for reliably encapsulating expertise. The MATLAB/Simulink suite of software was chosen for its industry-standard compatible modular structure.
2 A Model of a PAC System Figure 1 shows a PAC system and the mechanical PAC system model initially used for this simulation work. This model has been successfully used previously [1], [2], [3], [4] and [5].
Fig. 1. (a) PAC System (b) Elementary PAC System Mechanical Model
Although a few more advanced models exist, variables appeared to be less well defined for some more complex models and it was deemed prudent to commence work with a more elementary model. The catenary is modelled as a spring, with rate equal to kave, which, depending on the required complexity of the simulation, may be linear or non-linear. The pantograph is modelled as a dual-mass system incorporating known masses representing the head and frame, respectively, along with appropriate springs and dampers. The system parameters and their values, where known, are shown in Figure 2 and Table 1, respectively. Analysis of this system model allows the construction of a system of equations, shown in Figure 3, which may be used to facilitate modelling and simulation. The system of equations was rearranged for expression in state-space form in the MATLAB workspace, as shown in Figure 4.
Fig. 2. System Parameters
Table 1. System Parameter Values
Qu.    Parameter        Value          Units
Mh     P head mass      9.1            kg
Mf     P frame mass     17.2           kg
xh     Unknown          ?
xf     Unknown          ?
xcat   Unknown          ?
kh     P head spring    7 x 10^3       Nm-1
ks     P shoe spring    82.3 x 10^3    Nm-1
kave   Ave cat spring   1.535 x 10^6   Nm-1
bvh    P head damper    130            Nm-1s
bvf    P frame damper   30             Nm-1s
uup    Upward force     ?              N
Fig. 3. System Equations
$\dot{x}_1 = x_2$
$\dot{x}_2 = \dfrac{b_{vh}}{m_h}x_4 - \dfrac{b_{vh}}{m_h}x_2 + \dfrac{k_h}{m_h}x_3 + \dfrac{1}{m_h}\left(\dfrac{k_s^2}{k_{ave}+k_s} - (k_h + k_s)\right)x_1$
$\dot{x}_3 = x_4$
$\dot{x}_4 = -\left(\dfrac{b_{vh}}{m_f}+\dfrac{b_{vf}}{m_f}\right)x_4 + \dfrac{b_{vh}}{m_f}x_2 - \dfrac{k_h}{m_f}x_3 + \dfrac{k_h}{m_f}x_1 + \dfrac{1}{m_f}u_{up}$
$y = \dfrac{k_{ave}}{k_s + k_{ave}}\,x_1$
Fig. 4. System Equations rearranged for MATLAB
3 Control System Analysis with MATLAB Control Systems Toolbox The previously described state-space model was used to produce equations – hence a Simulink block representing the dynamics of the PAC system; a transfer function block and a direct state-space block were both produced for later use in a Simulink model of the whole Active PAC system. The code for this, and expected output, is shown below in Figure 5 and 6, respectively.
Fig. 5. MATLAB Code for Extracting Control Data from the PAC Model
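The MATLAB listing of Fig. 5 is not reproduced in this text. As a rough illustration of the same step, the sketch below assembles the state-space matrices from the parameter values of Table 1, assuming the state vector x = [x_h, x_h', x_f, x_f'] implied by the reconstruction of Fig. 4 above; it approximates the paper's MATLAB workflow rather than reproducing the original code.

```python
import numpy as np

# Parameter values from Table 1 (catenary stiffness taken as the average value).
mh, mf = 9.1, 17.2                    # head and frame masses (kg)
kh, ks, kave = 7e3, 82.3e3, 1.535e6   # spring rates (N/m)
bvh, bvf = 130.0, 30.0                # damper coefficients (N s/m)

# State x = [x_h, x_h', x_f, x_f'], input u = uplift force, output y = contact force.
a21 = (ks**2 / (kave + ks) - (kh + ks)) / mh
A = np.array([
    [0.0,      1.0,       0.0,      0.0],
    [a21,     -bvh / mh,  kh / mh,  bvh / mh],
    [0.0,      0.0,       0.0,      1.0],
    [kh / mf,  bvh / mf, -kh / mf, -(bvh + bvf) / mf],
])
B = np.array([[0.0], [0.0], [0.0], [1.0 / mf]])
C = np.array([[kave / (ks + kave), 0.0, 0.0, 0.0]])
D = np.array([[0.0]])

print(np.linalg.eigvals(A))   # quick sanity check of the open-loop dynamics
```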
Fig. 6. Output from MATLAB Code
4 Active PAC Fuzzy Control System Simulation The preceding work was used to facilitate the construction of a complete control system model for the elementary PAC system, including a fuzzy controller. Figure 7 shows a feedback control system design which has featured in previous work and which was deemed suitable as a basis for this work [4].
Fig. 7. Block Diagram of Closed-loop Active PAC Control System (After: [4])
5 Development of the Fuzzy Inference System (FIS) The Controller block, featured in Figure 7, was created using the MATLAB Fuzzy Inference System (FIS) Editor, [6]. Input and output membership functions and the fuzzy rule-base were specified in accordance with the assessed requirements of the project, with the resulting FIS structure exported as a named variable to the MATLAB workspace. Figures 8a and b show the top level editing screen and Rule Editor, respectively. Following initial experiments with a Mamdani model, a Sugeno model was chosen for speed and suitability for control applications. Five rules were considered to be a good compromise of complexity for initial tests.
Fig. 8. The MATLAB FIS Editor – (a) Top-level Editing Screen (b) Rule Editor
Figure 9a and b depict the input (contact force sensor) and output (linear motor actuator) membership functions, respectively. Five membership functions at the input and output were considered adequate for initially representing the system; the output singletons were placed at values of: 50, 75, 100, 125, and 150 (Newtons of contact force). After editing, the resulting FIS was then exported to the MATLAB workspace as a variable, ready for deployment as a fuzzy controller in Simulink.
Fig. 9. The FIS Editor – Membership Functions (a) Input (b) Output
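The actual membership functions are only given graphically in Figs. 8 and 9. As a rough stand-in (the triangular input sets, their breakpoints, and the one-to-one rule mapping below are assumptions; only the five output singletons at 50 to 150 N come from the text), a zero-order Sugeno evaluation of such a controller could be sketched as:

```python
import numpy as np

# Output singletons from the paper (N); the input partition is assumed.
SINGLETONS = [50.0, 75.0, 100.0, 125.0, 150.0]
CENTRES = [50.0, 75.0, 100.0, 125.0, 150.0]   # assumed triangular MF centres (N)
WIDTH = 25.0

def tri(x, centre, width=WIDTH):
    """Triangular membership degree in [0, 1]."""
    return max(0.0, 1.0 - abs(x - centre) / width)

def sugeno_output(contact_force):
    """Zero-order Sugeno: weighted average of singletons by firing strengths."""
    weights = np.array([tri(contact_force, c) for c in CENTRES])
    if weights.sum() == 0.0:
        return SINGLETONS[len(SINGLETONS) // 2]   # fall back to the middle rule
    return float(weights @ np.array(SINGLETONS) / weights.sum())

print(sugeno_output(90.0))    # e.g. a measured contact force of 90 N
```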
6 MATLAB-Simulink Simulation The implemented FIS of Section 5 was combined with the control system model of Section 4, the two being built into a new Simulink model in a ‘.mdl’ file, [7], [8], [9]. To simulate the effect of contact line irregularities, a random number generating block was incorporated in the model before the contact force sensor and set to produce variation between +/-10 (Newtons of force). Extra gain blocks were added to facilitate experimentation with different components and controllers. The model is shown in Figure 10.
Fig. 10. Simulink Model Using Fuzzy Controller
Although this was a new model with estimated simulation parameters, it was possible to obtain results that seem to validate this implementation. Figure 11 shows a Contact Force plot; the Desired Force is the central line, at a constant 100N, whilst the actual force is shown oscillating about it. Figure 12 shows the notional displacement associated with the contact force and Figure 13 gives actuator force.
Fig. 11. Plot of Results
Fig. 12. ‘Displacement’ Associated with Contact Force
Fig. 13. Actuator Force Required to Maintain Contact Force at Catenary
It is also possible to view the operation of the fuzzy control block of the system, whilst the simulation is underway, and this makes for interesting viewing as it shows the membership functions and rule-base functioning together to control the contact force, as it happens; Figure 14a shows the Rule Viewer. The ‘surface view’ of the FIS is shown in Figure 14b for information. The control surface is non-linear due to the particular combination of membership functions and rules. It is acknowledged that the membership functions and rule configurations may be extensively refined for this application, and this process will be presented in future work, along with other modifications.
Fig. 14. (a) Rule Viewer – Typical Output (b) The Control Surface
7 Conclusions and Future Work
• A mechanical engineering model and a suitable set of differential equations were derived to represent a pantograph-catenary (PAC) system.
• The transfer function and state-space representation of the PAC system were derived.
• The aforementioned work was used to formulate a Simulink model of the PAC system, for which a base-line fuzzy inference system (FIS) was conceived and added.
• An initial set of simulations indicates that the model is working and that the characteristics of the PAC system have been successfully represented.
Future work will aim to add further realism to the simulation model by featuring, for example, non-linearities of the catenary line arrangement. In addition, the fuzzy controller will be improved by investigating: simple refinement of parameters, application of fuzzy PID control and/or neuro-fuzzy optimisation.
Acknowledgement. The support of the European Union’s Regional Development Fund, through the Interreg IV project “PAntograph Catenary Interaction Framework for Intelligent Control (PACIFIC)”, is gratefully acknowledged.
References
1. O’Connor, D.N., Eppinger, S.D., Seering, W.P., Wormley, D.N.: Active Control of a High-Speed Pantograph. Journal of Dynamic Systems, Measurement and Control 119, 1–4 (1997)
2. Huang, Y.J.: Discrete Fuzzy Variable Structure Control for Pantograph Position Control. Electrical Engineering 86, 171–177 (2004)
3. Huang, Y.J., Kuo, T.C.: Discrete Pantograph Position Control for the High Speed Transportation Systems. In: Proceedings of the IEEE International Conference on Networking, Sensing and Control (2004)
4. Nise, N.S.: Control Systems Engineering, 5th edn., International Student edn. John Wiley & Sons Inc., Chichester (2008)
5. Bandi, P.: High-Speed Rail Pantograph Control System Design (Sourced from web search) (2009)
6. The MathWorks: Fuzzy Logic Toolbox User’s Guide V.2. The MathWorks (2001)
7. The MathWorks: Simulink Control Design V.2. The MathWorks (2006)
8. The MathWorks: Getting Started with Simulink 6. The MathWorks (2007)
9. The MathWorks: MATLAB and Simulink Built-in Help Facility. The MathWorks (2009)
10. Atherton, D.: Control Engineering. Ventus Publishing ApS, BoonBooks (2009)
11. Dorf, R.C., Bishop, R.H.: Modern Control Systems, 11th edn. Pearson, London (2008)
Floor Circulation Index and Optimal Positioning of Elevator Hoistways Panagiotis Markos and Argyris Dentsoras Machine Design Lab., Dept. of Mechanical Engineering and Aeronautics, University of Patras, 26500, Patras, [email protected], [email protected]
Abstract. In modern buildings, the position(s) of elevator system(s) affects strongly the efficiency of people circulation especially during periods of peak demand. The present study introduces a simple model that correlates circulation data (building type, population size and density, space use, etc.) with structural/architectural data (net usable space, circulation space, structural intrusions, facilities etc) of a typical floor of a building. A circulation index is defined as function of these data and its value is calculated for every cell of the grid that partitions all usable spaces. Euclidean norms are used for calculating weighted mean distance values and for locating the point on floor’s surface that corresponds to minimal mean walking distance to/from the floor’s usable spaces. A heuristic search algorithm combines calculations of distance values, constraints and empirical knowledge for fine tuning hoistway position around this point. A case study of a typical building floor served by a single-elevator system exemplifies the proposed approach. Keywords: Elevator, building, transportation index, optimal position.
1 Introduction
In modern buildings, elevator systems serve as means that cover transportation and circulation needs during peak and regular periods of a typical working day. Targeting for optimized building designs leads to highly analyzed building layouts where the use of building subspaces, the selection of facilities and the interaction with the outdoor environment are of crucial importance. Recent technological advances have made feasible the production, installation and operation of elevators that present high degrees of efficiency, reliability, safety and comfort. The establishment of robust design methods, in combination with proper simulation environments, has also contributed towards this direction. The operation of an elevator system is strongly related to its ability to respond efficiently to transportation demands posed by people that circulate to/from and within a building. The three (3) most important operational parameters for an elevator system are elevator speed, number of cars and capacity, and all of them are strongly connected to circulation data which include, among others, the type and purpose of the building, the height of its typical floor, the number of floors,
the floor surface and the existence of other transportation means that operate in parallel with elevator systems. Usually, the design of an elevator system is of routine type and starts with the determination of the aforementioned three operational parameters. This is a cumbersome task since multiple circulation data of different types should be taken into account. There are tables, measurements, statistical data, formulas, regulations, empirical rules, etc. that get involved in the design process and, by applying proper calculation and/or selection procedures, provide more circulation data and, eventually, the required parameter values. In the literature, there are few textbooks and handbooks that provide reliable design guidelines [1], [2], [3], while major manufacturers have their own manuals and design guides. Regarding the relationship between circulation data and elevator operational parameters, there are few articles worth mentioning. Apart from a number of references devoted to control issues, the rest are mostly papers on various optimization problems such as circulation, spatial arrangement of elevators, etc. In a paper by Marmot and Gero [4], an empirically based model of elevator lobbies in multi-storey office buildings is proposed. The authors formulate a technique for the analysis of elevator lobbies based on a number of variables which include floor space index area, structure and core area, lobby area, circulation area within the building core, total circulation area and net rentable office area. Results obtained after processing real data include the relationship between a number of area types and the index area, relationships for the establishment of performance standards for elevator lobbies and equations relating circulation and other areas to the total index area for the building. In a paper by Matsuzaki et al. [5], an algorithm is presented that provides a solution of the facility layout problem for multiple floors. The capacity of elevators is taken into consideration and the proposed algorithm optimizes the number and location of elevators with consideration of the assignment of each carriage to the elevators. Results are presented that demonstrate the effectiveness of the proposed algorithm. Goetschalckx and Irohara [6] present several alternative formulations for the block layout problem of a multi-floor facility where the material handling transportation between floors is executed through elevators. Schroeder [7] discusses elevatoring techniques and their respective merits and shortcomings, and the optimum application ranges for the possible configurations are shown. It is concluded that double-decking becomes attractive at 30 floors and is still feasible at 90 floors. Sky lobby elevatoring is advantageous when used in buildings with more than 40 to 50 floors. The present paper takes into consideration the work done by Marmot and Gero [4] and introduces a simple model that correlates circulation data (population size, demands during peak periods, etc.) with structural/architectural data (net usable space, circulation space, structural intrusions, facilities, etc.) for a typical floor in a building. It introduces a concept for optimized building circulation and is the first part of a larger project that aims at establishing a circulation-data-based methodology for positioning elevator hoistways in multistorey buildings that present multiplicity and diversity of floor layouts. Within
this context and by considering a typical floor, a circulation index is initially introduced and defined as function of these data. Then, for every cell of a square grid that partitions usable spaces, the value of the circulation index is determined. Subsequently, Euclidean norms and weighted mean values are used for locating the point of optimal positioning of the elevator system so that the mean walking distances to/from the floor’s usable spaces are kept minimal. A heuristic search algorithm combines distance values, constraints and empirical knowledge for fine tuning hoistway position around this point. A case study of a typical floor served by a single-elevator system exemplifies the proposed approach.
2 Positioning of Elevator Hoistways and Optimality of Circulation Data
2.1 The Circulation Index
Consider a typical floor of an office building (see figure 1 for a simplified example). On this floor, three types of spaces may be distinguished: a. net usable space (white-coloured surfaces in the figure), b. circulation space (grey-coloured surfaces in the figure) and c. all other types of space such as structural intrusions, various facilities, etc. (shown by hatched surfaces and black lines in the figure).
Fig. 1. A typical floor of an office building
If G is a square grid that covers and partitions the floor surface S, the location of each grid cell c belonging to S is given as c(x, y), where x and y are the coordinates of cell’s lower left corner. By assuming adequate grid resolution, it may be concluded that every cell will belong to one and only one of the aforementioned surface types and will inherit its basic circulation characteristics. However, cells of different subspaces may present considerably different
circulation characteristics if their use is different. For example, the cells of office 1 present different characteristics from those of office 2; the first is a secretary office and the second houses an executive manager (see figure 1). Also, it is logical for subspaces that host many people for a short time to inherit circulation characteristics which imply denser population and a larger possibility of elevator usage. There may also be cases where groups of cells within the same subspace are differentiated from each other due to varying degrees of circulation intensity. In figure 1, in conference room 1, the population in subspace S1 (speaker’s space) is very small; in contrast, the population in subspace S2 occupied by the attendants is higher. For every cell c(x, y) of the usable space a circulation index may be defined as a function of its circulation characteristics as follows:
F(u, b, n, x, y) = u · b · a(n) · S(x, y) · St(x, y) / p        (1)
where:
– u, u ∈ [0, 1], is the possibility that a person on c(x, y) uses the elevator during the peak period,
– b is an index of building type, with b = 1 for single-purpose buildings, b = 1.08 for diversified single-purpose buildings and b = 1.2 for diversified buildings,
– a(n) is a function of the number of building floors n, with a(n) = 1 when n ≤ 10, a(n) = 1.04 when 10 < n ≤ 20, a(n) = 1.12 when 20 < n ≤ 30 and a(n) = 1.2 when 30 < n,
– S(x, y) is the surface in (m²) of the space the grid cell belongs to,
– p is the density of population of the space the grid cell belongs to, in (m²/person),
– St(x, y) is the index of the type of the working people in the space the grid cell belongs to, with St(x, y) = 1 for simple working persons, St(x, y) = 0.5 for simple managers and St(x, y) = 0.3 for executive managers.
If Fmax is the maximum value of F in Su, then the normalized circulation index for cell c(x, y) is given by:
Fnorm(u, b, n, x, y) = F(u, b, n, x, y) / Fmax        (2)
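A minimal sketch of how expressions (1) and (2) could be evaluated over a grid is given below. It follows the product form u·b·a(n)·S·St/p as reconstructed in (1); the example cell values are invented for illustration only.

```python
def a_of_n(n):
    """Floor-count factor a(n) as tabulated in the definition of (1)."""
    if n <= 10:
        return 1.0
    if n <= 20:
        return 1.04
    if n <= 30:
        return 1.12
    return 1.2

def circulation_index(u, b, n, S, St, p):
    """Expression (1): F = u * b * a(n) * S * St / p."""
    return u * b * a_of_n(n) * S * St / p

# Illustrative cells of the usable space Su: (u, S in m^2, St, p in m^2/person)
cells = {
    (4, 8): (0.75, 30.0, 1.0, 5.0),    # e.g. a secretary office
    (12, 6): (0.75, 45.0, 0.3, 15.0),  # e.g. an executive office
}
b, n = 1.0, 10   # single-purpose building with 10 floors, as in the case study
F = {c: circulation_index(u, b, n, S, St, p) for c, (u, S, St, p) in cells.items()}
Fmax = max(F.values())
F_norm = {c: F[c] / Fmax for c in F}   # expression (2)
print(F_norm)
```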
2.2 The Search for Solutions
A reasonable hypothesis regarding circulation space Sr, with Sr ⊂ S and Sr ∩ Su = ∅, would consider each one of its grid cells as a candidate for being part of the rectangular surface to be occupied by the elevator system hoistway (symbol Sh). There are, however, constraints posed by regulations, standardizations, structural limitations, performance requirements, floor planning, etc. [1], [2], [3] which, when combined with empirical design rules, not only restrict the number of feasible solutions but - when properly used - may provide assistance in obtaining the optimal of them. Below, some of these constraints and rules are listed:
– Hoistway space Sh is always adjacent to circulation space Sr
– If Sh does not penetrate into Su, then all cells adjacent to one of its sides should belong to spaces other than Sr (the position H1 of the hoistway space in figure 1 is characteristic of this case)
– Sh may occupy part of Su if and only if all cells adjacent to one of its sides are also adjacent to cells belonging to circulation space Sr (see position H2 in the same figure)
– The distance from any grid cell of Su to the geometric center of the hoistway surface should not exceed 60 (m)
– There should be free space in front of the hoistway that must be at least equal to its surface
– Elevator door(s) should always face circulation space
– It is advisable for the hoistway to be adjacent to as many usable spaces as possible
The above set may form a small system of rules and constraints that could be used for guiding the search for the final solution of the problem, formally defined as follows: “For a given configuration of space and circulation data of a typical floor of a building, find the position of the elevator hoistway(s) so that the mean walking distance to/from all grid cells of usable space is minimal.”
For solving the problem, it is first assumed that the circulation index values have already been calculated (see expression 2) and assigned to all grid cells of Su. Then, for every grid cell r(xr, yr) that belongs to the floor surface, the arithmetic mean of all Euclidean distances to all cells of Su is given as:
dm(r) = (1/n) · Σ_{i=1}^{n} √((xr − xci)² + (yr − yci)²)        (3)
A circulation index has been assigned to every cell of Su. Then, through this index, the arithmetic mean may be weighted, and according to expressions (2) and (3):
dw(r) = Σ_{i=1}^{n} [1 − Fnorm(ui, b, n, xi, yi)] · √((xr − xci)² + (yr − yci)²) / Σ_{i=1}^{n} [1 − Fnorm(ui, b, n, xi, yi)]        (4)
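The sketch below evaluates (3) and (4) on a toy grid and picks the cell rm with the minimal weighted mean. The cell coordinates, the set of usable cells and their normalized indices are illustrative assumptions, and the weights follow the [1 − Fnorm] form of (4) as reconstructed above.

```python
import math

# Usable-space cells of Su with their normalized circulation index Fnorm
# (illustrative values; in practice these come from expression (2)).
usable = {(1, 1): 0.9, (1, 5): 0.4, (6, 1): 0.7, (6, 5): 0.2}

def mean_distance(r, cells):
    """Expression (3): plain arithmetic mean of Euclidean distances."""
    return sum(math.dist(r, c) for c in cells) / len(cells)

def weighted_mean_distance(r, cells):
    """Expression (4): distances weighted by [1 - Fnorm]."""
    w = {c: 1.0 - f for c, f in cells.items()}
    return sum(w[c] * math.dist(r, c) for c in cells) / sum(w.values())

# Candidate positions: every cell of a small floor grid.
grid = [(x, y) for x in range(8) for y in range(7)]
dw = {r: weighted_mean_distance(r, usable) for r in grid}
r_m = min(dw, key=dw.get)
print(r_m, round(dw[r_m], 2), round(mean_distance(r_m, usable), 2))
```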
Weighted means may be calculated for all cells of the grid that partitions the floor and, subsequently, the minimum of them and the corresponding point rm of Sr that provides this minimum can be easily determined. Point rm is a basis for fine tuning the final position of the elevator hoistway. This implies that the hoistway will be initially placed so that rm is the lower left corner of its lower left cell. Then the hoistway should be subjected to a consistency test in order to verify that all rules and constraints are satisfied. It is advisable that multiple variations
Fig. 2. Alternative hoistway orientations around point rm
Fig. 3. Fine tuning of hoistway position in the area of the point of minimum weighted Euclidean mean
of hoistway orientation are generated in order to increase the number of feasible alternatives and reduce the possibility of solution rejection. In figure 2 the hoistway a1 and its seven (7) variations a2..a4, b1..b4 are depicted in the region of point rm. Although variation a1 does not satisfy the constraints, variation b3 does and, therefore, constitutes a solution to the problem. If a solution cannot be found, then an iterative process should commence which will eventually position the hoistway at an allowed position. The final hoistway location must be as close to rm as possible. The iterative process takes
place in a search space that is a subset of the floor grid and whose “center” coincides initially with rm. The search is based on the values of the weighted mean distances of space cells that are neighbours of rm. Every such cell is tested for consistency with respect to the rule base, in increasing order of weighted mean distance value. Figure 3 shows schematically an implementation example of the method for a square elevator hoistway. If 0 is the cell whose lower left corner coincides with point rm, then it is obvious that none of the four (4) variations a01..a04 can provide a solution. Then, starting from 0 and by excluding all cells that leave point rm outside the hoistway space, the iteration process generates eight (8) candidate cells. Since the weighted mean for cell 1 is the lowest among all means of the candidate cells, cell 1 is the next position for the hoistway and its variations. Starting now with this cell, the process is repeated for two more cycles. At cycle three, variations a32 and a33 provide solutions to the problem.
3 The Algorithm - An Example Case
The proposed method may be structured as an algorithm whose basic steps are given below:
begin
  Step 1 - for all cells r of space Sr calculate dw(r)
             if dw(r) < (dw(r))min
               set (dw(r))min = dw(r)
               set r as rm
             end
  Step 2 - set rm as test point rt
  Step 3 - form list of variations Vrt for rt
  Step 4 - while not(variation satisfies constraints posed by T)
             form list Lrt of allowed points around rt
             sort list Lrt ascending according to dw of its points
             set first point of Lrt as rt
             if not(rt in Lv)
               form list of variations Vrt for rt
               for all variations in Vrt
                 if variation satisfies constraints posed by T
                   exit while
               end for
             set rt in list of visited points Lv
           end while
  Step 5 - report rt and variation as solution
end.
The above algorithm was applied for solving the problem of positioning a single-elevator hoistway in the space of a typical floor of a small office building (see figure 4). For determining the dimensions and shape of the elevator’s hoistway,
Fig. 4. Typical floor of a small office building of type b = 1 and a(n) = 1 with n = 10
the elevator’s rated capacity was taken into account, according to the EN 81-1 and EN 81-2 standards. This capacity was calculated by considering the building’s population, the number of passengers arriving at the main terminal during the morning, the average period of time between successive arrivals of the elevator at the main terminal and the average number of passengers in each trip. It is assumed that for the present case the typical floor belongs to a 10-floor building. The total usable and facilities area is equal to 990 (m²) for the whole building. With a population density of 5 (m²) per person, the building population is 198 (persons). By assuming that 10 (%) of the total occupants are absent, the total population is 178.2 (persons) and the number of passengers that arrive at the main terminal during the morning up-peak is 26.7 (persons). The number of trips is equal to 10 and the number of passengers per trip is 2.67 (persons). This leads to a rated capacity of 4 (persons) and a maximum car area equal to 0.90 (m²). Since the examined floor belongs to an office building, regulations impose the use of elevators that can serve persons with disabilities according to the EN 81-70 standard. Finally, a total hoistway area of 1.90 (m) x 2.10 (m) is considered, with the assumption that an electrical elevator is used. In figure 4, in addition to the dimensions of all spaces, circulation data are given for each usable space. These data correspond to case 1 (see table 1). Execution of step 1 of the aforementioned algorithm for the values shown in the figure provides the value of the minimal mean weighted distance, which is equal to 5.73 (m), and positions the corresponding point at coordinates (4.2, 8) (see figure 5). Variations of the circulation data will provide different locations for that point, as shown in Table 1. When running the algorithm for these variations, it was noticed that the calculated position point of the elevator hoistway was very sensitive to even small changes of the floor circulation data.
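The quoted capacity figures can be reproduced with the following arithmetic. The 15% up-peak arrival fraction is inferred from the quoted numbers (26.7 ≈ 0.15 × 178.2) and is therefore an assumption, as are the variable names.

```python
total_usable_area = 990.0    # m^2 for the whole building
density = 5.0                # m^2 per person
population = total_usable_area / density            # 198 persons
present = population * (1.0 - 0.10)                 # 10% absent -> 178.2 persons
up_peak_arrivals = present * 0.15                   # assumed 15% fraction -> ~26.7 persons
trips = 10
passengers_per_trip = up_peak_arrivals / trips      # ~2.67 persons per trip
print(population, present, round(up_peak_arrivals, 1), round(passengers_per_trip, 2))
# The rated load selected above this average trip load is 4 persons, i.e. a
# maximum car area of 0.90 m^2, as stated in the text (EN 81-1 / EN 81-2 sizing).
```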
Table 1. Variations of circulation data and location of point of minimal weighted mean distance

Case | U0 (u, ST) | U1 (u, ST) | U2 (u, ST) | U3 (u, ST) | U4 (u, ST) | Dist. (m) | x-coord. (m) | y-coord. (m)
  1  | 0.75, 0.5  | 0.8, 1     | 0.9, 0.3   | 1, 1       | 1, 1       | 5.73      | 4.2          | 8.00
  2  | 1, 1       | 1, 1       | 1, 1       | 0.5, 0.3   | 0.6, 0.5   | 4.83      | 12.4         | 6.50
  3  | 1, 1       | 1, 1       | 1, 1       | 1, 1       | 1, 1       | 5.64      | 5.00         | 9.3
  4  | 0.75, 0.3  | 1, 1       | 1, 1       | 0.6, 0.5   | 1, 1       | 6.02      | 6.7          | 3.50
Fig. 5. Fine tuning of hoistway position for case 1 of Table 1
In figure 5, case 1 of Table 1 is considered for fine tuning the hoistway position. Point 0 corresponds to the initial point of minimal weighted mean distance and point 15, with coordinates (4, 4), is the final position for the elevator hoistway, obtained by applying the aforementioned algorithm for fifteen (15) cycles. Point 15 corresponds to three alternative acceptable positions for the hoistway. The choice of the final one among them depends upon additional floor planning constraints.
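A compact sketch of the fine-tuning loop (Steps 2–5 of the algorithm) is given below. The constraint test is a placeholder — in the real method it encodes the rule base of Section 2.2 — and the neighbourhood generation is simplified to the 8-connected cells around the current test point, so this is only an approximation of the cell-exclusion logic illustrated in figure 3.

```python
def fine_tune(r_m, dw, satisfies_constraints, variations):
    """Greedy search around r_m, visiting cells in increasing dw order.

    dw: dict mapping grid cells to their weighted mean distance (expression 4).
    satisfies_constraints: placeholder consistency test against the rule base.
    variations(cell): yields candidate hoistway placements anchored at the cell.
    """
    visited = set()
    r_t = r_m
    while True:
        for v in variations(r_t):
            if satisfies_constraints(v):
                return r_t, v                     # Step 5: report solution
        visited.add(r_t)
        neighbours = [(r_t[0] + dx, r_t[1] + dy)
                      for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                      if (dx, dy) != (0, 0)]
        candidates = [c for c in neighbours if c in dw and c not in visited]
        if not candidates:
            return None                           # no feasible placement found
        r_t = min(candidates, key=dw.get)         # next cell: lowest dw
```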
4 Conclusions
Given that the main role of an elevator system is to provide an efficient transportation means for people circulating to/from and within the building, its design should be optimal with respect to circulation data. The present paper describes a simple model that quantifies these data for a typical building floor and introduces a circulation index that uniquely represents their intensity. The index is used for determining - with the help of a simple heuristic algorithm - the optimal hoistway position on the floor surface. The analysis and the case study have shown that as soon as the floor surface is partitioned via a grid of adequate resolution, it is easy to calculate the circulation index for all its usable spaces. Additionally, the computational cost for calculating the weighted mean Euclidean distances for every grid cell of the floor surface and for locating the minimal one among them is relatively low. Finally, it has also been proven (see the Table above) that the location of the point of minimal weighted mean distance is very sensitive to variations of the circulation data of the different subspaces. To search for solutions for the final position of the hoistway, the algorithm uses the weighted mean distance as a heuristic function and judges the suitability of a proposed solution by using a simple system that encompasses constraints and empirical design rules. The application of the algorithm (see the case study above) provided solutions for all considered variations of circulation data with acceptable computational cost. The proposed model is being further elaborated for enriching its rule base and for considering multiple types of floors with varying layouts and surfaces. For the latter case, a 3D optimization model is currently under construction.
References
1. Strakosch, G.R.: The Vertical Transportation Handbook, 3rd edn. Wiley, Chichester (1998)
2. Barney, G.C.: Elevator Traffic Handbook: Theory and Practice. Taylor & Francis, London (2003)
3. Services CIOB: Transportation Systems in Buildings. Chartered Institution of Building Services Engineers (1993)
4. Marmot, A., Gero, J.S.: Towards the development of an empirical model of elevator lobbies. Building Science 9(4), 277–287 (1974)
5. Matsuzaki, K., Irohara, T., Yoshimoto, K.: Heuristic algorithm to solve the multi-floor layout problem with the consideration of elevator utilization. Computers and Industrial Engineering 36(2), 487–502 (1999)
6. Goetschalckx, M., Irohara, T.: Formulations and optimal solution algorithms for the multi-floor facility layout problem with elevators. In: IIE Annual Conference and Expo. 2007 - Industrial Engineering’s Critical Role in a Flat World - Conference Proceedings, pp. 1446–1452 (2007)
7. Schroeder, J.: Sky lobby elevatoring: A study on possible elevator configurations. Elevator World 37(1), 90–92 (1989)
Rapid Evaluation of Reconfigurable Robots Anatomies Using Computational Intelligence Harry Valsamos1, Vassilis Moulianitis2, and Nikos Aspragathos1 1
Mechanical and Aeronautics Engineering Dept. University of Patras, 26500, Rio, Achaia, Greece 2 Dept. of Product and Systems Design Engineering, University of Aegean, 84100, Ermoupolis, Syros, Greece [email protected], [email protected], [email protected]
Abstract. Designing a reconfigurable manufacturing robotic workcell is a complex and resource demanding procedure. In this work a multi criteria index is introduced, allowing the designer to evaluate the various anatomies achieved by a reconfigurable manipulator, and to define the area in the manipulator’s configuration space where a task can be accomplished with good performance under the selected performance measure. An adaptive neuro-fuzzy inference system is trained in order to rapidly produce the index value for arbitrary anatomies achieved by the manipulator. The system is tested using a case study reconfigurable manipulator, and the results produced by the system after its training are presented and compared to the actual index values calculated for the relevant anatomies. Keywords: Reconfigurable robots, ANFIS, Anatomy selection.
1 Introduction
The design of robotic manufacturing workcells has been under constant research due to the rapidly increasing usage of robotic systems in the manufacturing industry. In the design stage, the engineer has to address several key issues such as the matching of a robot type to a given task, the optimal positioning of the task in the robot’s workspace for the robot to present the best performance, the sequencing of the task(s) to achieve shorter cycle time, etc. These considerations are task oriented, resulting in the design of a single purpose robotic workcell with optimal performance. Several approaches and methods are proposed in the relevant literature addressing the problems of robotic workcell design [1,2,3,4]. In the last decades, as the reconfiguration paradigm was identified as a key feature for the enhancement of manufacturing productivity [5] by allowing rapid adaptation, reconfigurable robots have increasingly attracted the interest of researchers [6,7]. The design of a robotic workcell including reconfigurable robots is a far more challenging task than the design of the corresponding one composed of fixed anatomy
robots. Firstly, the design under the assumption that a single task or a variety of similar ones should be addressed by the workcell is not applicable, since the main advantage of a reconfigurable system is its ability to address a wider range of different tasks by altering its anatomy and software. Therefore, the first and foremost problem to be considered is the determination of the optimal anatomy for a given task. This particular problem was addressed in the literature and several interesting methods were proposed [8,11,12]. For both the optimal anatomy and task planning, these procedures are extremely time consuming and require significant amounts of computational power, due to the sharp increase in the search space. This happens because for each emerging anatomy the manipulator presents a different workspace, both in terms of volume and shape but most importantly in terms of the variation of robot performance. An engineering tool is required to evaluate an emerging anatomy in terms of performance for the particular task, using a measure such as the manipulability measure, MVR, etc., and then to determine the configuration space area where the performance of the manipulator is “good”. Since the evaluation of the anatomy is time consuming, an adaptive neuro-fuzzy inference system (ANFIS) is trained towards the rapid calculation of the performance measure. ANFIS has been used in various applications including engineering [8], medicine [9], economics, etc. In the present paper a robotic performance measure is proposed, allowing the evaluation of emerging anatomies of a reconfigurable manipulator based on manipulability. A neuro-fuzzy inference system is presented that allows the rapid determination of the proposed measure for each emerging anatomy, enabling its use in optimal design processes. An example for a six DOF reconfigurable robot manipulator is presented showing the advantages of this procedure, and concluding remarks close the paper.
2 The Proposed Global Kinematic Measure
A local robotic kinematic measure y depends on the particular manipulator configuration θ and is given by a non-linear function:
y = f(θ)        (1)
where θ = [θ1, θ2, ..., θn]^T is a vector of joint variables. Since most kinematic measures
are local, efforts have been made to derive global indices characterizing the manipulator’s performance in the whole volume of its workspace [13] or for a given task [14,15]. Such measures are very commonly used in the relevant literature presenting methods and tools for the optimal design and planning of manipulator tasks. The procedure usually met in such optimization problems for fixed anatomy robotic systems today requires the determination of an area in the manipulator’s workspace where, if a task is placed, the robot shows the best possible performance [16]. Typically this is achieved by computing a global index in terms that characterize the manipulator’s performance for the given task, requiring the subsequent recalculation of the index for a different task. Such a procedure is demanding in terms of time and computational resources.
The design and operation of reconfigurable robots present a far greater challenge. Their changeable anatomy results in different workspaces for a given reconfigurable structure, each presenting a different performance behavior. For a reconfigurable robot, the variation of its anatomy has to be taken into account in the formulation of the measure function, since the m anatomical parameters θp = [θp1, ..., θpm]^T directly affect its value; therefore the measure is given as:
y = f(θp, θ)        (2)
For each anatomy denoted by θp, the determined values of y create a hyper-surface in the n-dimensional configuration space. Figure 1 shows the curve formed by the values of y in the configuration space for a system of 1 d.o.f.
Fig. 1. The curve constructed by the values of θ for a 1 d.o.f. mechanism
In order to derive the introduced global measure for evaluating the current anatomy θp of the manipulator, the following indices are defined, as illustrated in fig. 1:
• The overall mean value of the measure achieved by the robot in its configuration space for the current anatomy, y
• The mean value of the m highest values achieved by the robot in its configuration space, ymax, given by:
ymax = (1/m) · Σ_{i=1}^{m} y_max^i        (3)
where the number m is determined by the designer and the highest values are determined by simply sorting all available values of y in the configuration space.
• The distance δymax between y and ymax, given by:
δymax = ymax − y        (4)
• The factor δy is the distance of the highest value of y recorded from ymax.
These three indices can be used to structure a multi criteria measure to evaluate the behavior of the current anatomy of the robot in its configuration space for the selected measure y. The overall mean value provides the starting point of the anatomy’s overall performance. A good anatomy should have a higher value than others achieved by the reconfigurable manipulator. The distance δymax provides an insight into the configuration space area where the current anatomy will present a “good” performance if the task is placed within it. The larger the area, the bigger the number of configurations contained therein, meaning that the anatomy presents larger areas in its workspace where it can perform tasks with good performance. However, there is always the danger that the anatomy may present a few very high values while the others are closer to y. This could cause the value of δymax to become rather high, but the number of configurations contained within it to be rather low, due to the sharp increase the extremely high values cause to ymax. In order to prevent such an anatomy from receiving a high evaluation score, a limiting factor is introduced. A low value of this index implies that the highest values appear in a more balanced way around ymax, which in turn implies that the area denoted by
δ ymax , contains a larger number of configurations, while a high value implies the opposite. The proposed measure is therefore structured by
y , δ ymax and δ y . A “good”
anatomy is one that achieves high values for the first two criteria and a low value for the last one, in comparison to other anatomies achieved by the reconfigurable system.
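Given a batch of sampled measure values for one anatomy, the three criteria can be computed as sketched below — a direct transcription of (3) and (4) plus the limiting factor. The sample values used in the print statement are placeholders.

```python
import numpy as np

def anatomy_criteria(y_samples, m=10):
    """Return (y_mean, delta_y_max, delta_y) for one anatomy.

    y_samples: measure values y sampled over the configuration space.
    m: size of the window of highest values used in expression (3).
    """
    y = np.asarray(y_samples, dtype=float)
    y_mean = y.mean()                      # overall mean value
    top = np.sort(y)[-m:]                  # m highest values
    y_max_mean = top.mean()                # expression (3)
    delta_y_max = y_max_mean - y_mean      # expression (4)
    delta_y = y.max() - y_max_mean         # limiting factor: spread of the peak
    return y_mean, delta_y_max, delta_y

print(anatomy_criteria(np.random.default_rng(1).random(200)))
```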
3 The ANFIS System for the Rapid Calculation of the Measure
3.1 Score Calculation
In order to take into account the interaction among the three criteria and to favor the anatomies that present the above requirements, the discrete Choquet integral [17] is used as the measurement of score, which is a generalization of the weighted arithmetic mean [17]. Assuming an anatomy θp^i with a profile of criteria C^i = {x1^i, x2^i, x3^i} = {y^i, δymax^i, δy^i}, the Choquet integral is defined as:
Cu(θp) := Σ_{j=1}^{n} x^i_(j) [ u(A(C_(j))) − u(A(C_(j+1))) ]        (5)
where u ∈ FC; FC denotes the set of all fuzzy measures on C, which is the set of the three criteria, and u is a monotonic set function on C defined as u : 2^C → [0, 1] with u(∅) = 0 and u(C) = 1. (·) indicates a permutation of C, the set of criteria, such that x_(j) ≤ ... ≤ x_(n), and
A(C_(j)) = {C_(j), ..., C_(n)},  A(C_(n+1)) = ∅        (6)
Since there are three criteria, six (6) fuzzy measures must be defined. The order of the criteria presented in this paper reflects their importance. So, u({C1}) = u({y}) has the highest value of the fuzzy measures that correspond to the subsets with cardinality one. The fuzzy measures of these subsets must satisfy the following:
u({C1}) > u({C2}) > u({C3})        (7)
In order to favor the anatomies that present high values for y and δymax and a low value for δy, the fuzzy measures that correspond to the subsets with cardinality two must satisfy the following:
u({C1}) + u({C3}) < u({C1, C3})
u({C1}) + u({C2}) > u({C1, C2})        (8)
u({C2}) + u({C3}) < u({C2, C3})
In addition, all fuzzy measures must fulfill the following constraints:
0 < u({Ci}) < u({Ci, Cj}) < 1 and 0 < u({Cj}) < u({Ci, Cj}) < 1,   ∀ i, j = 1, 2, 3 and i ≠ j        (9)
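A direct sketch of expression (5) with the fuzzy measures of Table 1 is shown below. The criteria profile in the usage line is invented, and any normalization of the criteria onto a common scale (not detailed in the text) is left to the caller.

```python
C1, C2, C3 = "y_mean", "delta_y_max", "delta_y"

# Fuzzy measures of Table 1, completed with u(empty) = 0 and u(C) = 1.
MU = {frozenset(): 0.0,
      frozenset({C1}): 0.4, frozenset({C2}): 0.35, frozenset({C3}): 0.3,
      frozenset({C1, C2}): 0.7, frozenset({C1, C3}): 0.8, frozenset({C2, C3}): 0.75,
      frozenset({C1, C2, C3}): 1.0}

def choquet(profile, mu=MU):
    """Discrete Choquet integral of expression (5)."""
    order = sorted(profile, key=profile.get)          # ascending criterion values
    score = 0.0
    for j, crit in enumerate(order):
        A_j = frozenset(order[j:])                    # A(C_(j))
        A_next = frozenset(order[j + 1:])             # A(C_(j+1))
        score += profile[crit] * (mu[A_j] - mu[A_next])
    return score

print(choquet({C1: 0.6, C2: 0.8, C3: 0.3}))           # example profile -> 0.58
```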
So far, the procedure to find the best anatomy is complex and time consuming. In order to lower the time of this procedure, an adaptive neuro-fuzzy inference system is trained to produce the score rapidly according to the pseudo-joint angles.
3.2 System Training
The proposed approach uses a set of k random anatomies θp and mj (j = 1...k) random configurations θ for the jth anatomy in order to produce a training data set for an adaptive
neuro-fuzzy inference system (FIS). The Sugeno-type system has as inputs the anatomical parameters θp that postulate the anatomy and rapidly derives the index of the current anatomy. Every input of the FIS system has three triangular fuzzy sets defined in [-π/2, π/2], while the outputs of the system (3^nθp of them, where nθp is the number of anatomical parameters) are constant numbers. Fig. 3 shows the mean error for a hundred epochs of training a system using 4096 samples of anatomies and 200 random samples of configurations for each anatomy.
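The training-set generation described above can be sketched as follows. The kinematic measure and the criteria aggregation are placeholders here (the authors compute manipulability from the robot’s Jacobian and aggregate via the Choquet integral of (5)), and the final regression step — the MATLAB ANFIS training — is only indicated, since no ANFIS implementation is shown in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N_ANATOMIES, N_CONFIGS, N_JOINTS, N_PSEUDO = 2000, 200, 6, 6

def measure(theta_p, theta):
    # Placeholder for the kinematic measure y(theta_p, theta); in the paper this
    # is the manipulability of the reconfigurable arm, computed from its Jacobian.
    return float(np.abs(np.sin(theta_p.sum() + theta)).mean())

def score(values, m=10):
    # Placeholder aggregation of the three criteria of Section 2; the paper uses
    # the Choquet integral (5) with the fuzzy measures of Table 1.
    top = np.sort(values)[-m:]
    return values.mean() + (top.mean() - values.mean()) - (values.max() - top.mean())

X, t = [], []
for _ in range(N_ANATOMIES):
    theta_p = rng.uniform(-np.pi / 2, np.pi / 2, N_PSEUDO)     # pseudo-joint settings
    ys = np.array([measure(theta_p, rng.uniform(-np.pi, np.pi, N_JOINTS))
                   for _ in range(N_CONFIGS)])
    X.append(theta_p)
    t.append(score(ys))
# (theta_p, score) pairs in (X, t) are the training set; the authors fit a
# Sugeno-type ANFIS to them, but any regression model could be trained here.
```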
Fig. 2. Training Data Set derivation and fuzzy inference system generation
Fig. 3. Training error for two hundred epochs
4 Case Study
In order to validate the correct operation of the ANFIS for the rapid determination of the measure for emerging anatomies achieved by a reconfigurable manipulator, an arbitrary 6 d.o.f. reconfigurable robot was structured using three active joints, rigid links, six pseudo joints and a spherical joint, representing the usual structure of most industrial manipulators today, where the axes of rotation of the last three joints intersect at a given point. Such an arrangement is favored due to the fact that an analytical solution to the inverse kinematics can be obtained. The reference manipulator is illustrated in fig. 4.
Fig. 4. The case study 6 d.o.f. reconfigurable manipulator
The robot’s reconfiguration to a new anatomy is achieved by resetting the pseudo joints to a new angle. Pseudo joints [18,19] are passive connector modules that, when placed in a modular reconfigurable robot’s lattice, allow the rapid reconfiguration of its anatomy to a new one without dismantling it. They allow the manipulator to achieve anatomies that are currently not favored in robotic design, i.e. presenting angles formed by consecutive joint axes different to the standard 0° or 90° of current fixed anatomy robot manipulators. The angle of the pseudo joints for this work varies within the range of [-90°, +90°]. The particular set up of the robot’s structure, i.e. the sequence and orientation of pseudo joints, links and active joints, was chosen so that the manipulator would be able to achieve a wide range of different anatomies, including both anatomies that are “in line” with current practice in manipulator design as well as anatomies that are not favored by current practice. The kinematic measure selected for this case study is the well known Yoshikawa manipulability index, given by:
w(θp, θ) = √( det( J(θp, θ) · J^T(θp, θ) ) )        (10)
where θp is the vector whose elements are the pseudo joint settings representing the current anatomy of the reconfigurable system and θ is the vector whose elements are the joint coordinates for the current configuration (posture) of the robot as its end effector reaches a point in its workspace. Yoshikawa proposed the well known manipulability index as a measure of the ability of a manipulator to move its end effector in its workspace in terms of speed and also as a measure of how far the current position of the end effector lies from a singular configuration of the robot [20]. In order to train the system, 2000 random samples of anatomies and 200 random samples of configurations per anatomy were derived. Using the 200 configurations, the w, δwmax (in a window of the 10 highest values) and δw values are calculated for every anatomy, providing the training sets. In order to calculate the score according to formula (5) for every anatomy, the fuzzy measures are defined according to formulas (7), (8), (9) and are shown in Table 1. Using these training sets, the adaptive neuro-fuzzy inference system is trained and tested using 30 random samples. Table 2 presents the actual training data. The results are shown in fig. 5.
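Expression (10) reduces to a determinant computation once the Jacobian of the current anatomy and configuration is available. The sketch below assumes such a Jacobian function is supplied elsewhere (here replaced by a random 6×6 matrix purely for illustration).

```python
import numpy as np

def manipulability(J):
    """Yoshikawa manipulability w = sqrt(det(J J^T)) for a 6 x n Jacobian."""
    JJt = J @ J.T
    return float(np.sqrt(max(np.linalg.det(JJt), 0.0)))

# Stand-in for the Jacobian J(theta_p, theta) of the case-study manipulator.
J = np.random.default_rng(0).normal(size=(6, 6))
print(round(manipulability(J), 4))
```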
Table 1. Definition of the fuzzy measures for the case study

Set        Fuzzy Measure
{C1}       0.4
{C2}       0.35
{C3}       0.3
{C1, C2}   0.7
{C1, C3}   0.8
{C2, C3}   0.75
Table 2. Actual training data. (Pseudo joint settings (anatomy) in degrees)
Θp1  Θp2  Θp3  Θp4  Θp5  Θp6  Index value
90   90   90   90   90   90   0.1144
90   30   90   90   30   90   0.2033
90   90   90   30   30   90   1.4880
30   30   90   30   90   90   0.8455
30   90   90   30   90   90   0.3916
30   30   90   90   30   90   0.1415
30   90   30   90   30   90   0.5211
90   30   30   30   90   90   0.6054
90   90   30   30   90   90   0.7904
Fig. 5. Comparison between the results of the trained FIS system and the calculated ones
In this example all scores are calculated in 1.42 sec using the trained system, while they need 147.79 sec using equation (5). The time needed to calculate the score by direct evaluation depends on the number of configuration samples used, while the score estimation by the FIS is independent of it. The mean (absolute) error comparing the trained system results and the calculated results was -0.038 (0.2426), indicating an acceptable performance of the trained system.
5 Conclusions
A multi criteria index is proposed to address the problem of evaluating the different anatomies emerging through the reconfiguration of a robotic system at the early stages of the reconfigurable workcell design. The index allows the end user to evaluate the various anatomies based on their respective overall kinematic performance, therefore assisting the choice of the anatomy most suited to the task. Additionally, the index also allows the determination of a “good performance” area in the manipulator’s configuration space where, if the task is placed there, the selected anatomy will exhibit good kinematic performance. This helps to reduce the search space for later parts of the workcell design process, such as placing the task in the optimal location in the manipulator’s workspace and scheduling it. In order to reduce the required time and complexity for determining the index value for all possible anatomies, an ANFIS system is created and trained using a smaller number of samples. After the system training, it produces the index values for all possible anatomies achieved by the manipulator. The use of the system allows for a steep decrease in the overall computational time and load required in order to obtain the results for each new anatomy, significantly improving the design process performance.
References
1. Petiot, J., Chedmail, P., Hascoet, J.: Contribution to the Scheduling of Trajectories in Robotics. Robotics and Computer Integrated Manufacturing 14, 237–251 (1998)
2. Zacharia, P.T., Aspragathos, N.A.: Optimal Robot Task Scheduling based on Genetic Algorithms. Robotics and Computer-Integrated Manufacturing 21(1), 67–79 (2005)
3. Dissanayake, M., Gal, J.: Workstation planning for Redundant Manipulators. Int. J. Production Research 32(5), 1105–1118 (1994)
4. Aspragathos, N., Foussias, S.: Optimal location of a robot path when considering velocity performance. Robotica 20, 139–147 (2002)
5. Hoepf, M., Valsamos, H.: Research Topics in Manufacturing – The I*PROMS Delphi Study. In: IEEE INDIN 2008 Conference on Industrial Informatics (2008)
6. Paredis, C., Brown, B., Khosla, P.: A rapidly deployable manipulator system. In: Proceedings of the International Conference on Robotics and Automation, pp. 1434–1439 (1996)
7. Badescu, M.: Reconfigurable Robots Using Parallel Platforms as Modules. PhD Thesis, The State University of New Jersey (2003)
8. Chen, S.-H.: Collaborative Computational Intelligence in Economics. ISRL, vol. 1, pp. 233–273 (2009)
9. Yang, G., Chen, I.-M.: Task-based Optimization of Modular Robot Configurations: Minimized Degree of Freedom Approach. Mechanism and Machine Theory 35, 517–540 (2000)
10. Chen, I.-M.: Rapid Response Manufacturing Through a Rapidly Reconfigurable Robotic Workcell. Robotics and Computer Integrated Manufacturing 17, 199–213 (2001)
11. Kahraman, C., Gólbay, M., Kabak, O.: Applications of Fuzzy Sets in Industrial Engineering. A Topical Classification. Studies in Fuzziness and Soft Computing 201, 1–55 (2009)
12. Yardimci, A.: A Survey on Use of Soft Computing Methods in Medicine. In: de Sá, J.M., Alexandre, L.A., Duch, W., Mandic, D.P. (eds.) ICANN 2007. LNCS, vol. 4669, pp. 69–79. Springer, Heidelberg (2007)
13. Gosselin, C., Angeles, J.: A Global Performance Index for the Kinematic Optimization of Robotic Manipulators. Transactions of the ASME 113, 220–226 (1991)
14. Aspragathos, N., Foussias, S.: Optimal location of a robot path when considering velocity performance. Robotica 20, 139–147 (2002)
15. Valsamos, H., Nektarios, T., Aspragathos, N.A.: Optimal Placement of Path Following Robot Task using Genetic Algorithms. In: SYROCO 2006 (2005)
16. Valsamos, H., Aspragathos, N.: Determination of Anatomy and Configuration of a Reconfigurable Manipulator for the Optimal Manipulability. In: ASME/IFToMM International Conference on Reconfigurable Mechanisms and Robots, London, pp. 497–503 (2009)
17. Grabisch, M.: The application of fuzzy integrals in multicriteria decision making. European Journal of Operational Research 89, 445–456 (1996)
18. Valsamos, H., Aspragathos, N.A.: Design of a Versatile Passive Connector for Reconfigurable Robotic Manipulators with Articulated Anatomies and their Kinematic Analysis. In: I*PROMS 2007 Virtual Conference (2007)
19. Valsamos, H., Moulianitis, V., Aspragathos, N.: A Generalized Method for Solving the Kinematics of 3 D.O.F. Reconfigurable Manipulators. In: I*PROMS 2009 Virtual Conference (2009)
20. Yoshikawa, S.: Foundation of Robotic Analysis and Control. The MIT Press, Cambridge (1990)
Incremental Construction of Alpha Lattices and Association Rules
Henry Soldano1, Véronique Ventos2, Marc Champesme1, and David Forge2
1 L.I.P.N, UMR-CNRS 7030, Université Paris-Nord, 93430 Villetaneuse, France
2 LRI, UMR-CNRS 8623, Université Paris-Sud, 91405 Orsay, France
Abstract. In this paper we discuss Alpha Galois lattices (Alpha lattices for short) and the corresponding association rules. An alpha lattice is coarser than the related concept lattice and so contains fewer nodes, hence fewer closed patterns, and a smaller basis of association rules. Coarseness depends on an a priori classification, i.e. a cover C of the powerset of the instance set I, and on a granularity parameter α. In this paper, we define and experiment with a Merge operator that, when applied to two Alpha lattices G(C1, α) and G(C2, α), generates the Alpha lattice G(C1 ∪ C2, α), so leading to a class-incremental construction of Alpha lattices. We then briefly discuss the implementation of the incremental process and describe the min-max bases of association rules extracted from Alpha lattices.
1 Introduction
Galois lattices (or concept lattices) explicitly represent all subsets of a set I of instances that are distinguishable according to a given language L, whose elements will be called terms or patterns. A node of a concept lattice is composed of a set of instances (the extent) and a term (the intent). The intent is the most specific term which recognizes the instances of the extent, and the extent is the greatest set of instances so recognized. This means that a fraction of the language (the set of intents) represents all the distinguishable subsets of instances. Such an intent is also denoted as a closed term, and as a frequent closed term when only closed terms, whose extents are large enough according to a bound minsupp∗|I|, are considered [8]. In Data Mining the problem of finding frequent itemsets, an essential part of the process of extracting association rules, is therefore reduced to finding closed frequent itemsets. Many research efforts are currently spent in finding efficient algorithms for that purpose [12]. More recently, in order to deal with semi-structured data, more expressive patterns than itemsets have been heavily investigated [7,5]. Again closure operators have been used to efficiently extract closed patterns, as for instance attribute trees [1,11]. Still the number of closed patterns is often too high when dealing with real-world data, and some way to select them has to be found [6]. Recently, Alpha Galois lattices (Alpha lattices for short) have been defined [16]. Each node of an Alpha lattice corresponds to an Alpha closed term i.e. a closed term that satisfies a local frequency constraint, therefore allowing a better control on the selection of closed terms. The definition
of Alpha lattices relies on the assumption that there exists an a priori categorization of the data, in such a way that the set of instances I is covered with a set C of subsets of I, here denoted as classes, each denoting some type or category. We then substitute α-extents for extents: an α-extent is obtained as the aggregation of large enough (with respect to a bound α) parts of classes. As a consequence, two closed terms that have different extents may share the same α-extent (the converse is false), thus resulting in only one α-closed term. The value of the α parameter (ranging in [0, 1]) denotes a degree of coarseness or granularity in our perception of the data. Note that, as association rules represent (approximate) inclusion of extents, by changing the notion of extent we obtain α-association rules with respect to α-extensions, at various levels of granularity. α-closed terms w.r.t. such a cover C form a subset of the closed terms w.r.t. I, generalize frequent closed terms and are structured in an Alpha lattice. Our main contribution in this paper concerns the flexibility in data exploration allowed by an incremental construction of Alpha lattices. More precisely, we define here class incrementality as the opportunity to update an Alpha lattice when classes are added to the current ones, and so to extend the set of α-closed terms by extending the cover C (and so extending the set of instances to consider). For that purpose we have to extend the subposition operator of Formal Concept Analysis (FCA, [4]) previously used in [14] to efficiently build the concept lattice related to the instance set I1 ∪ I2 from the concept lattices respectively related to I1 and to I2. This leads to a class-incremental construction of the Alpha lattice GCα, performed by iteratively merging the various frequent lattices G{Ci}α corresponding to each class Ci of C. The paper is organized as follows. In section 2 we present preliminaries and previous work. Sections 2.1 and 2.2 briefly present Galois lattices and Alpha lattices. In section 2.3 we discuss α-association rules extracted from a reduced Alpha lattice to which a further global frequency constraint has been applied. Section 3 describes the incremental construction of Alpha lattices. In section 3.1 we present the extensional merging of Galois lattices and in section 3.2 we apply the Merge operator to Alpha lattices and describe their class-incremental construction. Section 4 discusses implementation issues of the incremental process together with the construction of a min-max basis of α-association rules. The last section summarizes the paper and concludes.
2 Preliminaries and Previous Work
2.1 Galois Lattices
Detailed definitions, results and proofs regarding Galois connections and lattices may be found in [2,3], and in the field of Formal Concept Analysis in [4]. In the rest of the paper we denote as Galois lattice the formal structure that we define below. We recall that a mapping w from an ordered set M to M is called a closure operator iff for any pair (x, y) of elements of M we have a) x ≤ w(x) (extensity), b) if x ≤ y then w(x) ≤ w(y) (monotonicity), and c) w(x) = w(w(x))
(idempotency). An element of M such that x = w(x) is called a closed element of M w.r.t. w. The formal definitions of Galois connection and Galois lattice are given below:
Definition 1 (Galois connection and Galois lattice). Let int : E → L and ext : L → E be maps between two lattices (E, ≤E) and (L, ≤L). Such a pair of maps form a Galois connection if for all e, e1, e2 in E and for all t, t1, t2 in L:
1. e1 ≤E e2 ⇒ int(e2) ≤L int(e1)
2. t1 ≤L t2 ⇒ ext(t2) ≤E ext(t1)
3. e ≤E ext(int(e)) and t ≤L int(ext(t))
Let G = {(e, t) ∈ E × L | e = ext(t) and t = int(e)} and let ≤ be defined by: (e1, t1) ≤ (e2, t2) iff e1 ≤E e2. Then G = (E, int, L, ext) is a lattice called a Galois lattice.
Note that int ◦ ext and ext ◦ int are closure operators on (E, ≤E) and (L, ≤L). In what follows, L is a lattice of terms expressing some statement on instances and ≤L is a less-specific-than relation. E is a lattice extracted from P(I) where I is a set of instances and ≤E is defined as ≤E = ⊆. ext(t) will be the greatest element of E whose elements satisfy t and int(e) will be the most specific term t satisfied by all the elements of e. As a particular case, let us consider E = P(I) and L = P(A) where A is a set of binary attributes (or items) defined on L, and so t1 ≤L t2 ⇐⇒ t1 ⊆ t2. Then int(e) is the set of attributes common to all the instances in e and ext(t) is the set of instances which have all the attributes of t. The corresponding Galois lattice is known as a concept lattice [4]. In this paper we are mainly concerned with variations on E and so we consider, with no loss of generality, that L = P(A). As a consequence we will use ⊆ as ≤L, ∩ as the lower bound operator and ∪ as the upper bound operator. The following example will be used throughout the paper.
Example 1. We consider animal species divided in three classes: mammals (C1), insects (C2) and birds (C3). A set of properties allows to describe each species. Table 1 describes the data.

Table 1. Tab(i, j) = 1 if the jth attribute belongs to the ith instance

Animals     Flies Wings Feath. Hairs Vivip. Teats Ovip.
cow           0     0     0     1     1      1     0
flying sq.    1     0     0     1     1      1     0
tiger         0     0     0     1     1      1     0
rabbit        0     0     0     1     1      1     0
platypus      0     0     0     1     0      1     1
bat           1     1     0     1     1      1     0
ant           0     0     0     0     0      0     1
bee           1     1     0     0     0      0     1
wasp          1     1     0     0     0      0     1
condor        1     1     1     0     0      0     1
duck          1     1     1     0     0      0     1
ostrich       0     1     1     0     0      0     1
pigeon        1     1     1     0     0      0     1

Amongst these animals some are, to some extent,
exceptional to their class: they lack properties common to their class or possess properties seldom found in their class. Regarding mammals, the petauristini (known as the flying squirrel) flies (more precisely, a petauristini glides, but here we consider it as a flying mammal), the platypus is not viviparous and the bat both flies and has wings. Also, most birds fly, but the ostrich does not.
2.2 Alpha Galois Lattices
In what follows, we start from the concept lattice G = (P(I), int, L, ext). In [5] the authors use projections as a way to obtain smaller lattices by reducing the language L. [9] independently introduces extensional projections to modify the ext function. The latter yields a new equivalence relation on L which is coarser than the standard one. More precisely, we consider here the equivalence relation ≡L^ext defined on L by t ≡L^ext t' ⇐⇒ ext(t) = ext(t'). Coarseness and projections are defined hereunder, together with a theorem [9] presenting projected Galois lattices:
Definition 2 (Coarseness). Let ≡1 and ≡2 be two equivalence relations defined on L, then ≡2 is said to be coarser than ≡1 iff for any pair (t, t') of elements of L we have t ≡1 t' ⇒ t ≡2 t'.
Definition 3 (Projection). p : E → E is a projection on an ordered set (E, ≤E) iff for any pair (x, y) of elements of E: p(x) ≤E x (minimality), x ≤E y ⇒ p(x) ≤E p(y) (monotonicity), p(x) = p(p(x)) (idempotency).
Projections on P(I) are denoted as extensional projections. We further consider as a lattice E any extensional projection p(P(I)) (including P(I) itself).
Theorem 1 (Extensional order on Galois connections). Let (int, ext) be a Galois connection on L and E, proj be a projection on E, E1 = proj(E) and ext1 = proj ◦ ext. Then:
(int, ext1) define a Galois connection on L and E1, and therefore a Galois lattice G1 = (E1, int, L, ext1).
≡L^ext1 is coarser than ≡L^ext.
G1 is said to be coarser than G. We further denote G1 as a projected lattice and write G1 = proj(G). Each node (t, e) of G has a projection proj(t, e) = (int(proj(e)), proj(e)) in G1. Starting from E = P(I), we hereunder define Alpha lattices by defining a projection that relies on the association of one (possibly several) type to each individual of I. The corresponding groups of instances are denoted as classes and a set of classes is denoted as a cover. Note that a cover may be partial, meaning that there may be instances of I that do not belong to any class.
Definition 4 (Alpha extent). Let I be a set of instances, C be a cover of P(I), t a term of L and α ∈ [0, 1], then:
proj_α^C(e) = {i ∈ e | ∃C ∈ C with i ∈ C and |e ∩ C| ≥ α·|C|}, and ext_α^C(t) = proj_α^C ◦ ext(t).
proj_α^C is a projection and E_α^C = proj_α^C(P(I)) is a lattice. Hence, changing ext in this way gives rise to a new Galois connection in which extents are made of sufficiently large parts of classes. The corresponding Galois lattice G_α^C is called an Alpha lattice. When α is equal to 0, we have the original concept lattice. Figure 1 represents the Alpha lattice of Mammals, Insects and Birds of Table 1 with α = 0.6.
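A small Python sketch (again illustrative only, reusing DATA and ext from the sketch above) of the projection of Definition 4 and of the resulting α-extent; the assignment of the animals to the three classes is the obvious one implied by the example.

```python
# Alpha projection and alpha extent of Definition 4 on the toy data.
COVER = {
    "Mammals": {"cow", "flying sq.", "tiger", "rabbit", "platypus", "bat"},
    "Insects": {"ant", "bee", "wasp"},
    "Birds":   {"condor", "duck", "ostrich", "pigeon"},
}

def proj_alpha(extent, cover, alpha):
    """Keep an instance only if its class is represented in the extent by
    at least a fraction alpha of the class members."""
    return {i for i in extent
            if any(i in cls and len(extent & cls) >= alpha * len(cls)
                   for cls in cover.values())}

def ext_alpha(term, cover, alpha):
    return proj_alpha(ext(term), cover, alpha)

# With alpha = 0.6 the two flying mammals are filtered out of ext({Flies}):
print(sorted(ext_alpha({"Flies"}, COVER, 0.6)))
# ['bee', 'condor', 'duck', 'pigeon', 'wasp']
```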
Fig. 1. The Alpha lattice G_0.6^{Insects, Birds, Mammals}
2.3
Frequent α-Closed Terms and Alpha Association Rules
In Alpha Galois lattices we apply a local frequency constraint in such a way that intents frequent in at least one class appear in the lattice. However, a side effect is that an Alpha Galois lattice may still be very large, especially when using small values of alpha. Fortunately it is possible to add a global constraint: we will only consider nodes whose α-extent is large enough, eliminating nodes that represent few instances. For any real number f with 0 ≤ f ≤ 1, we consider the projection proj_f on E_α such that proj_f(e) = e whenever |e|/|I| ≥ f and proj_f(e) = ∅ otherwise. The result of such a filter is again a Galois lattice G_α^f that corresponds to the topmost part of G_α and is denoted as an Iceberg Alpha lattice.
The partial order induced on terms by the Galois connection can be related to a set of implication rules. More precisely, ext_I(t1) ⊆ ext_I(t2) means that the implication t1 → t2 holds for all instances of I. Now, in Alpha Galois lattices, whenever ext_α(t1) ⊆ ext_α(t2) we will say that the α-implication t1 →α t2 holds on the pair (I, C). Hereunder we adapt the definitions of support and confidence for association rules to Alpha association rules by changing extents to α-extents:
Definition 5. An α-association rule is a pair of terms t1 and t2, denoted as t1 →α t2. The support and confidence of an α-association rule r = t1 →α t2 are defined as α-supp(r) = |ext_α(t1 ∪ t2)| / |I| and α-conf(r) = |ext_α(t1 ∪ t2)| / |ext_α(t1)|.
The α-association rule r = t1 →α t2 holds on the pair (I, C) whenever α-supp(r) ≥ minsupp and α-conf(r) ≥ minconf.
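As an illustration of Definition 5 (not the authors' code), α-support and α-confidence can be computed directly from ext_alpha of the previous sketch:

```python
# Alpha support and confidence of Definition 5 on the toy data.
def alpha_supp(t1, t2, cover, alpha):
    return len(ext_alpha(t1 | t2, cover, alpha)) / len(DATA)

def alpha_conf(t1, t2, cover, alpha):
    denom = len(ext_alpha(t1, cover, alpha))
    return len(ext_alpha(t1 | t2, cover, alpha)) / denom if denom else 0.0

# Example alpha-association rule {Wings} ->_0.6 {Feath.}:
print(round(alpha_supp({"Wings"}, {"Feath."}, COVER, 0.6), 2))   # 0.31
print(round(alpha_conf({"Wings"}, {"Feath."}, COVER, 0.6), 2))   # 0.67
```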
3
Incremental Construction of Alpha Lattices
We will see here that an Alpha lattice may be computed by merging frequent lattices through an extensional merging operator. First, however we present some formal results leading to the extensional merging operator. 3.1
Extensional Merging of Galois Lattices
Proposition 1. Let G = (E, int, L, ext) be a Galois lattice, proj1 and proj2 be two extensional projections on E, and ext1 = proj1 ◦ ext, ext2 = proj2 ◦ ext. If proj1 and proj2 are such that for any e ∈ E, we have e = proj1 (e) ∪ proj2 (e), then: 1. For any node (t, e) of G, let t1 = int(ext1 (t)) and t2 = int(ext2 (t)) be closed terms of G1 and G2 , then t = t1 ∩ t2 and e = ext1 (t1 ) ∪ ext2 (t2 ). 2. For any nodes (t1 , e1 ) from G1 and (t2 , e2 ) from G2 , let t = t1 ∩ t2 then (t, ext(t)) is a node of G and ext1 (t1 ) ∪ ext2 (t2 ) ⊆ ext(t). Proof: a) First we show that a term t1 closed w.r.t int ◦ ext1 , is also closed w.r.t int ◦ ext. We have ext1 (t1 ) = proj1 ◦ ext(t1 ) ⊆ ext(t1 ) because proj1 is a projection. As a consequence of condition 2 defining Galois connections, we have then int(ext(t1 )) ≤L int(ext1 (t1 )) = t1 . As int ◦ ext is a closure operator we have also t1 ≤L int(ext(t1 )). So t1 = int(ext(t1 )) is closed w.r.t int ◦ ext. In the same way a term t2 closed w.r.t. int ◦ ext2 is also closed w.r.t. int ◦ ext. We conclude that t1 ∩ t2 is closed w.r.t. int ◦ ext. For any term t, by considering e = ext(t) and applying the first condition of the proposition, we have ext(t) = ext1 (t) ∪ ext2 (t). Now from the properties of the Galois connection we have ext1 (t) = ext1 (int ◦ ext1 (t)) = ext1 (t1 ) and ext2 (t) = ext2 (int ◦ ext2 (t)) = ext2 (t2 ). So ext(t) = ext1 (t1 ) ∪ ext2 (t2 ). b) Let t = t1 ∩ t2 , we have seen before that t is closed w.r.t. int ◦ ext. We also have t ≤L t1 and as a consequence ext1 (t1 ) ⊆ ext1 (t) and so ext1 (t1 ) ⊆ ext(t). In the same way we obtain that ext2 (t2 ) ⊆ ext(t). As a consequence ext1 (t1 ) ∪ ext2 (t2 ) ⊆ ext(t).
A consequence of this proposition is that the nodes (t, e) of the Galois lattice G = (E, int, L, ext) are obtained by intersecting the intents of the nodes (t1, e1) of the projected Galois lattice proj1(G) = (E1, int, L, ext1) with the intents of the nodes (t2, e2) of proj2(G) = (E2, int, L, ext2), and keeping as the extent, when two resulting intents are identical, the greatest union of the original extents e1 and e2. This defines the Merge_E operator. From an algorithmic point of view, Proposition 1 leads to a generic algorithm in the spirit of the Merge algorithm for subposition detailed in [14]:
Definition 6 (Extensional merging operator). Let G = (L, ext, E, int) be a Galois lattice. Let proj1 and proj2 be two projections on E such that for any e ∈ E, we have e = proj1(e) ∪ proj2(e). Then the extensional merging operator Merge_E defined above following Prop. 1 is such that: G = proj1(G) Merge_E proj2(G)
3.2
Class Incrementality of Alpha Lattices
In what follows we use the Alpha lattices associated to the covers C1 and C2 to build the Alpha lattice associated to the cover C1 ∪ C2. Let us denote as G_α^C the Alpha lattice associated to the cover C. Let proj1 and proj2 be defined as follows, for any e ∈ E_α^C = proj_α^C(P(I)):
proj1(e) = {i ∈ e | ∃C ∈ C1 such that i ∈ C}, and proj2(e) = {i ∈ e | ∃C ∈ C2 such that i ∈ C}
Then, proj1 and proj2 are projections on E_α^C, and we have G_α^{C1} = proj1(G_α^C) and G_α^{C2} = proj2(G_α^C). Furthermore we clearly have, for any e ∈ E_α^C, e = proj1(e) ∪ proj2(e), and we therefore obtain the following proposition:
Proposition 2. Let C be a cover of E_α^C and C1, C2 two subsets of C such that C = C1 ∪ C2, then G_α^C = G_α^{C1} Merge_{E_α^C} G_α^{C2}
So we can compute an Alpha Galois lattice stepwise by first computing the Frequent lattices G_α^{C1}, ..., G_α^{Cn} associated to each class C1, ..., Cn and then repetitively using Merge. The main advantage of such a scheme is class-incrementality, i.e., the ability to update an Alpha lattice associated to the set of classes C when a new class is added to C.
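The intent-intersection step behind Merge_E can be sketched as follows (a rough illustration of Proposition 1, not the Galicia-based implementation described in Sect. 4); each lattice is represented simply as a map from closed intents to their (α-)extents:

```python
# Rough sketch of Merge_E: nodes of the merged lattice are produced by
# intersecting one intent of each projected lattice; when several pairs
# yield the same intent, the greatest union of original extents is kept.
def merge_e(lattice1, lattice2):
    """lattice1, lattice2: dict mapping frozenset(intent) -> set(extent)."""
    merged = {}
    for t1, e1 in lattice1.items():
        for t2, e2 in lattice2.items():
            t = t1 & t2        # t = t1 ∩ t2 is closed in the merged lattice
            e = e1 | e2        # ext1(t1) ∪ ext2(t2) ⊆ ext(t)
            if t not in merged or len(e) > len(merged[t]):
                merged[t] = e
    return merged
```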
4
Implementation
To implement the Merge operator we started from the subposition algorithm for concept lattices described in [13]. Our implementation of Merge operator is an adaptation of the implementation of this subposition operator, as found in the software Galicia2 , in which we substitute α-extents for extents. In practice, the main difficulty concerns the dual of the procedure Update-Order, as described in [14], that performs extents intersections and relies on distributivity of P(I). As projections do not preserve distributivity, we had to modify this procedure. 2
http://sourceforge.net/projects/galicia/.
Fig. 2. The frequent lattices G_0.6^{Insects} (a) and G_0.6^{Birds} (b)

Fig. 3. The Alpha lattice G_0.6^{Insects,Birds}. The 0.6-closed term {Oviparous, Has Wings} (node 9) is obtained by intersecting a closed term of G_0.6^{Insects} (node (2) in Figure 2(a)) with a closed term of G_0.6^{Birds} (node (0) in Figure 2(b)).
Figure 3 shows the Alpha lattice associated to the cover {Insects, Birds}, obtained by merging G_0.6^{Insects} with G_0.6^{Birds}, represented in Figures 2(a) and 2(b). Implication rules, together with association rules, are often built using closed frequent itemsets [17]. The basic idea is that a node in the concept lattice corresponds to an equivalence class of terms, all sharing the same extent. In particular, the intent of the node, i.e., the unique greatest term, has the same extent as all the smallest terms (also called generators). We obtain then for each node several implication rules whose left part are these generators, and whose right part is the intent of the node. For instance, let us consider node 26 of Figure 1: the two generators of the corresponding equivalence class are {HasHairs} and {HasTeats}. We obtain then the two implication rules {HasHairs} → {HasHairs, HasTeats} and {HasTeats} → {HasHairs, HasTeats} that are rewritten as {HasHairs} → {HasTeats} and {HasTeats} → {HasHairs}. This set is further extended to as-
sociation rules, whose confidences are smaller than 1 and whose right part are intents of more specific nodes [8]. The resulting set produces the so-called min-max basis of association rules. In order to compute the min-max basis of Alpha association rules, starting from the Iceberg Alpha lattice (or equivalently from the set of Alpha closed terms), we have straightforwardly adapted the method proposed in [10]. All these features are implemented in the software AlphabetaGalicia³.
5
Conclusion
In Alpha lattices the extent of a term is restricted according to constraints depending on an a priori categorization of instances into classes, and on a degree α, resulting in a smaller lattice [16]. In the present paper we have investigated the construction of Alpha lattices when new classes are added incrementally. We have first proposed a general method for merging two Galois lattices that are extensional projections of an original Galois lattice, and then described its application to the incremental construction of Alpha lattices. The discussion of a general strategy to maintain Alpha lattices whenever both new classes and new instances of current classes are provided is outside the scope of this paper. However, we can give here the general line of such a strategy: whenever new instances have to be added to a current class C, the Alpha lattice is projected in order to discard the contribution of the class C, the frequent lattice of C is updated as proposed in [15], and finally the two lattices are merged as Alpha lattices. We have also discussed some implementation issues of the software AlphabetaGalicia, which allows Alpha lattices to be built incrementally and Alpha association rules to be extracted.
References 1. Arimura, H.: Efficient algorithms for mining frequent and closed patterns from semi-structured data. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds.) PAKDD 2008. LNCS (LNAI), vol. 5012, pp. 2–13. Springer, Heidelberg (2008) 2. Barbu, M., Montjardet, M.: Ordre et Classification, Alg`ebre et Combinatoire 2. Hachette Universit´e (1970) 3. Birkhoff, G.: Lattice Theory. American Mathematical Society Colloquium Publications, Rhode Island (1973) 4. Ganter, B., Wille, R.: Formal Concept Analysis: Logical Foundations. Springer, Heidelberg (1999) 5. Ganter, B., Kuznetsov, S.O.: Pattern structures and their projections. In: Delugach, H.S., Stumme, G. (eds.) ICCS 2001. LNCS (LNAI), vol. 2120, pp. 129–142. Springer, Heidelberg (2001) 6. Janetzko, D., Cherfi, H., Kennke, R., Napoli, A., Toussaint, Y.: Knowledge-based selection of association rules for text mining. In: de M´ antaras, R.L., Saitta, L. (eds.) ECAI, pp. 485–489. IOS Press, Amsterdam (2004) 3
http://www-lipn.univ-paris13.fr/~champesme/alphabetagalicia.
7. Liquiere, M., Sallantin, J.: Structural machine learning with Galois lattice and graphs. In: ICML 1998. Morgan Kaufmann, San Francisco (1998) 8. Pasquier, N., Taouil, R., Bastide, Y., Stumme, G., Lakhal, L.: Generating a condensed representation for association rules. Journal Intelligent Information Systems (JIIS) 24(1), 29–60 (2005) 9. Pernelle, N., Rousset, M.-C., Soldano, H., Ventos, V.: Zoom: a nested Galois lattices-based system for conceptual clustering. J. of Experimental and Theoretical Artificial Intelligence 2/3(14), 157–187 (2002) 10. Stumme, G., Taouil, R., Bastide, Y., Pasquier, N., Lakhal, L.: Intelligent structuring and reducing of association rules with formal concept analysis. In: Baader, F., Brewka, G., Eiter, T. (eds.) KI 2001. LNCS (LNAI), vol. 2174, pp. 335–349. Springer, Heidelberg (2001) 11. Termier, A., Rousset, M.-C., Sebag, M., Ohara, K., Washio, T., Motoda, H.: Dryadeparent, an efficient and robust closed attribute tree mining algorithm. IEEE Trans. Knowl. Data Eng. 20(3), 300–320 (2008) 12. Uno, T., Asai, T., Uchida, Y., Arimura, H.: An efficient algorithm for enumerating closed patterns in transaction databases. Discovery Science, 16–31 (2004) 13. Valtchev, P., Missaoui, R.: Building concept (Galois) lattices from parts: generalizing the incremental methods. In: Delugach, H.S., Stumme, G. (eds.) ICCS 2001. LNCS (LNAI), vol. 2120, pp. 290–303. Springer, Heidelberg (2001) 14. Valtchev, P., Missaoui, R., Lebrun, P.: A partition-based approach towards building Galois (concept) lattices. Discrete Mathematics 256(3), 801–829 (2002) 15. Valtchev, P., Missaoui, R., Godin, R., Meridji, M.: Generating frequent itemsets incrementally: two novel approaches based on Galois lattice theory. J. of Experimental and Theoretical Artificial Intelligence 2/3(14), 115–142 (2002) 16. Ventos, V., Soldano, H.: Alpha galois lattices: An overview. In: Ganter, B., Godin, R. (eds.) ICFCA 2005. LNCS (LNAI), vol. 3403, pp. 299–314. Springer, Heidelberg (2005) 17. Zaki, M.J.: Generating non-redundant association rules. In: Intl. Conf. on Knowledge Discovery and DataMining, KDD 2000 (2000)
Intelligent Magnetic Sensing System for Low Power WSN Localization Immersed in Liquid-Filled Industrial Containers Kuncup Iswandy, Stefano Carrella, and Andreas König Institute of Integrated Sensor Systems, University of Kaiserslautern Erwin-Schroedinger-str. 12, 67663 Kaiserslautern, Germany {kuncup,carrella,koenig}@eit.uni-kl.de http://www.eit.uni-kl.de/koenig
Abstract. Wireless sensor networks (WSN) have become an important research domain and have been deployed in many applications, e.g., military, ambient intelligence, medical, and industrial tasks. The location of wireless sensor nodes is a crucial aspect to understand the context of the measured values in industrial processes. Numerous existing technologies, e.g., based on radio frequency (RF), light, and acoustic waves, have been developed and adapted for requirements of localization in WSN. However, physical constraints of the application environment and each localization technology lead to different aptness. In particular, for liquid media in industrial containers, determining the location of every sensor nodes becomes very challenging. In this paper, a localization concept based on intelligent magnetic sensing system using triaxial anisotropic magnetoresistive (AMR) sensor with appropriate switched coils combined with a centralized localization algorithm based on iterative nonlinear mapping (NLM) is presented. Here, our system is extended by low power and fast localization based on triangulation for feasible local position computation. The experimental results, both in the air as well as in liquid filled stainless steel container, delivered in the average an absolute localization error in the order of 6 cm for both NLM and triangulation. In future work, we will scale our approach to industrial container size required for beer brewing industry and increase the accuracy and speed by timely electronics and calibration. Keywords: Underwater wireless sensor networks, localization, non-linear mapping, triangulation, triaxial AMR sensor.
1
Introduction
Recently, employment of wireless sensor nodes has proved to be an effective technique for a wide range of applications, e.g., environmental and manufacturing process monitoring, medical diagnosis, military, and ambient intelligence. One crucial aspect shared by all of these applications is the location of sensor nodes. Localization in wireless sensor networks (WSN) is a process to estimate the location of each node. Without the location information, the measured data of
the whole process detected and reported by WSN can be meaningless. If the accurate location of the event is known, the information of events detected and reported by WSN can be meaningfully assimilated and responded to. Therefore, a need of reliable localization methods emerges from this requirement. Many localization techniques implemented in WSN have been proposed to allow estimating node location assisted by a set of anchor nodes. This anchor is defined as a node that is aware of its own location, either through a GPS or manual deployment at a point with known position [5,9]. However, the usage of GPS has become impractical solution in many applications, e.g., in underwater, and mostly is not a feasible solution from an economic perspective. Some alternative technologies used for WSN localization can be classified based on the underlying physical principles, e.g., radio frequency (RF), light, acoustic waves, and magnetic fields [2,4,6,7,8,9,10]. The applicability of localization technologies depends on the constraints of environment. By means of the radio module used for communication purposes, the localization algorithms can be computed by using the radio energy to indicate the received signal strength (RSSI). However, the usage of RF for localization requires more energy and can result in high attenuation of radio wave caused by materials, e.g., steel, copper, or liquid. The capability of light technology, in particular IR and UV light, could be decreased when implemented in the rough environment, which contains opaque materials such as smoke, dust, or muddy liquid. Possible impact to the industrial process must be considered as well, especially for UV light. The effect of reflection or scattering of air bubbles can reduce the capability of acoustic waves for localization in underwater. Magnetic fields are suitable for WSN localization in the liquid media, since the magnetic fields can penetrate materials such as water and numerous stainless steels, without suffering from high attenuation or distortion. The concept of magnetic tracker has been widely used for medical applications by locating the magnet inside the body [11]. Recently, the employment of magnetometers for localization of underwater sensors has been reported in [7], where the magnetometers were used to track the magnet dipole of a vessel. The trajectory of the vessel movement and the sensors was simultaneously estimated by an extended Kalman filter. In this paper, we present and extend our localization method based on an intelligent magnetic sensing system using triaxial Anisotropic Magnetoresistive (AMR) sensor combined with range-based localization algorithm [12]. The aim of this paper is to design a low power localization concept for wireless sensor nodes, which can be applied in the rough environment, in particular, for industrial container filled by a liquid.
2
Methodology of WSN Localization Based on Magnetic Sensing System
The approaches of localization can be distinguished by their assumptions about the network configuration and mobility, the distribution of the calculation process, and hardware capabilities. Four types of networks can be distinguished,
i.e., (1) static nodes, (2) static non-anchor and mobile anchor nodes, (3) mobile non-anchors and static anchor nodes, and (4) mobile all of nodes. In addition, the computing aspect of location has to be considered, either using one machine or using several distributed machines. In the centralized localization approach, all sensor nodes transmit their data to a central station, and the computation is then performed there to determine the location of each node. Multidimensional Scaling (MDS) and Non-Linear Mapping methods are grouped in the centralized approach. The distributed approach relies on self-organization of the nodes in WSN. In this approach, the sensor node estimates its position based on the local data collected from the neighbor nodes. Triangulation and multilateration methods are predominantly applied for localization in the distributed manner. More detail about localization methods can be found in [5,9]. Our localization concept is based on distance-based approach (also called range-based) using a triaxial AMR magnetic sensor and with switched transmitting coils applied as the static anchors [12]. The design and placement of the coils are arranged in such a way that high flux densities (i.e., larger than 0.5
Fig. 1. (a) A curve of magnetic flux density (log scale, for a coil with d = 13 cm, I = 5 A, n = 100, and negligible length) with regard to distance on the main axis of the coil, and (b) a placement of the coils at a container
Fig. 2. An illustration of magnetic flux density in sensor node position depending on coil positions, magnetic field emitted by coils, and sensor position
Fig. 3. (a) One cycle of the ternary quasi-DC current through the multicoil used to determine one position of wireless sensor nodes and (b) a general procedure of localization using the intelligent magnetic sensing system
mT) in all possible sensor locations are avoided due to the maximum applicable field of Sensitec AFF755B AMR sensors. Figure 1(a) shows the curve of magnetic flux density with regard to the distance on the main axis of a coil. The coil arrangement can be in co-planar lines or circles around the container, as shown in Fig. 1(b). The number and size of the designed coils depend on the size of the environment or container, such that each sensor node, equipped with the orthogonal triaxial AMR sensor, can measure the emitted magnetic fields of at least four coils to enable 3D localization. An illustration of the magnetic flux density at a sensor node position is shown in Fig. 2. Since flipping of AMR sensors to eliminate offset requires high currents and consumes power, the polarity of the coil current is switched instead. Therefore, ternary quasi-DC currents through the coils are employed to generate the magnetic fields. The current signal pattern is time multiplexed over the coils so that the fields are distinguishable from one another, as shown in Fig. 3(a). Only the raw data measured by the AMR sensors over the whole campaign is saved by each sensor node, as there is no urgent need for each sensor node to compute its location during measurement. This approach minimizes power, since the sensor nodes usually carry batteries of limited capacity, but it needs larger EEPROM memory. Using the ternary quasi-DC current, two magnetic field strength values of opposite direction are measured for each coil, and each of the measurements requires 2 bytes of memory. A centralized localization based on an iterative non-linear mapping (NLM) is adopted to compute the location of the sensor nodes in a postmortem evaluation. Another solution for low power localization is based on the triangulation method, a distributed localization algorithm that requires less computational effort than NLM. Figure 3(b) shows the general procedure of our localization concept based on the non-linear mapping and triangulation methods. More detail about these steps is given in the next subsections.
2.1
Range-Based Localization Method Based on Magnetic Sensors
In our localization concept, the coil positions are a priori determined by static placement around the infrastructure and used as absolute coordinates. The distance between sensor nodes and the anchors needs to be measured. Three magnetic sensors measure the magnetic field strength from the emitting coils. The magnetic flux density values are calculated from the measured signals of one coil. The absolute value Bt is computed from the three magnetic flux density values and is defined as follows:

Bt = (Bx² + By² + Bz²)^(1/2)    (1)
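A minimal Python sketch of Eq. (1) and of turning the field magnitude into a range estimate; the far-field, on-axis coil model used for the inversion and its parameters are illustrative assumptions, not the calibration actually used in the system.

```python
import math

def field_magnitude(bx, by, bz):
    """Eq. (1): absolute flux density from the three AMR axis readings."""
    return math.sqrt(bx**2 + by**2 + bz**2)

def distance_from_field(b_t, n_turns=100, current=5.0, radius=0.065):
    """Assumed on-axis approximation B ~ mu0*n*I*r^2 / (2*d^3) (Biot-Savart),
    inverted for the distance d; the coil values mirror Fig. 1 (d = 13 cm,
    I = 5 A, n = 100) but are for illustration only."""
    mu0 = 4.0 * math.pi * 1e-7
    return ((mu0 * n_turns * current * radius**2) / (2.0 * b_t)) ** (1.0 / 3.0)

b_t = field_magnitude(1.2e-6, -0.4e-6, 2.0e-6)   # example readings in tesla
print(distance_from_field(b_t))                   # range estimate in metres
```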
The absolute value for one coil is then used to compute the distance between the sensor node and the coil, as an RSSI equivalent, by an approximation of the Biot-Savart equation. A distance matrix is built from the distance values, i.e., between the sensor nodes and the coils and between each coil and the other coils. This distance matrix will be used by the non-linear mapping algorithm to compute the relative coordinates of the sensor nodes and coils. 2.2
Centralized Localization Algorithm Based on Sammon’s Non-linear Mapping
The range-based localization algorithm computes the position of sensor nodes by using distance values, which indicate the distance between sensor nodes and coils. Here, an iterative non-linear mapping algorithm is adapted to our concept to determine the location of wireless sensor nodes in a postmortem evaluation. The Non-Linear Mapping (NLM) method was introduced by Sammon in 1969 [3]. Basically, this method is used for data visualization and classification, by projecting high-dimensional data in ℝ^M into lower-dimensional data in ℝ^L. The NLM is done by computing the interpoint distances in the high-dimensional data and at the same time preserving these interpoint distances while mapping into the low-dimensional space. In [1], this algorithm has been successfully applied for WSN localization based on RSSI values, without dimension reduction. In the current 3-dimensional localization problem, both dimensions are the same, M = L = 3. The quality of the mapping is achieved by minimizing the total error (cost function) between the distances obtained in both spaces. This cost function is given by

E = [1 / Σ_{i=1}^{N-1} Σ_{j=i+1}^{N} d*_ij] · Σ_{i=1}^{N-1} Σ_{j=i+1}^{N} (d*_ij − d_ij)² / d*_ij    (2)

where N is the number of nodes, d*_ij is the Euclidean distance between m_i and m_j in ℝ^M and d_ij is the Euclidean distance between l_i and l_j in ℝ^L. The computational complexity of NLM is O(N²) times the number of iterations. The coordinates of sensor nodes and coils obtained by NLM are relative coordinates, which need the linear conformal transformation to obtain the correct
coordinates of sensor nodes. A pseudo-inverse is applied to solve for the linear conformal transformation matrix containing the translation and rotation from the computed coordinates to the actual coordinates of the anchors. This transformation matrix is then employed to find the correct coordinates of the sensor nodes.
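The iterative mapping can be sketched as follows (in Python/NumPy for brevity; this is plain gradient descent on the stress of Eq. (2), not Sammon's original second-order update and not the embedded implementation, and it assumes a complete symmetric distance matrix, whereas the approach of [1] also copes with incomplete distance information):

```python
import numpy as np

def sammon_stress(D_star, Y):
    """Stress of Eq. (2) for target distances D_star and coordinates Y."""
    D = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
    iu = np.triu_indices_from(D_star, k=1)
    return (((D_star[iu] - D[iu]) ** 2) / D_star[iu]).sum() / D_star[iu].sum()

def nlm(D_star, dim=3, iters=2000, lr=0.01, seed=0):
    """Map an N x N distance matrix into R^dim by gradient descent."""
    rng = np.random.default_rng(seed)
    n = D_star.shape[0]
    Y = rng.normal(size=(n, dim))
    c = np.triu(D_star, 1).sum()
    denom_star = D_star.copy()
    np.fill_diagonal(denom_star, 1.0)            # avoid division by zero
    for _ in range(iters):
        diff = Y[:, None, :] - Y[None, :, :]     # pairwise displacement vectors
        D = np.linalg.norm(diff, axis=-1)
        np.fill_diagonal(D, 1.0)
        W = (D - D_star) / (denom_star * D)      # gradient weights per pair
        np.fill_diagonal(W, 0.0)
        Y -= lr * (2.0 / c) * (W[:, :, None] * diff).sum(axis=1)
    return Y                                     # relative coordinates
```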
2.3 Distributed Localization Algorithm Based on Triangulation
Similar to NLM, the triangulation method has been widely used for visualization as well as localization of wireless sensor networks [13,5]. For 3D localization of sensor nodes, the triangulation requires only four distance values, which are computed from the magnetic flux densities of the nearest four coils. Since our localization concept is based on multiple coils, a sorting procedure is required to select the strongest four magnetic flux density values, corresponding to the closest coils. Figure 4 shows the 3D localization based on the triangulation with different arrangements of coils. The distance between a sensor node and a coil can be computed as follows:

d_i² = (x − x_i)² + (y − y_i)² + (z − z_i)²    (3)
where (x_i, y_i, z_i) are the coordinates of the i-th coil and (x, y, z) are the computed sensor node coordinates. By eliminating between the four distance equations, the three sensor node coordinates can be solved for. The computational complexity of this method is O(n). As its computational effort is smaller than that of NLM, this localization method can feasibly be performed in the sensor nodes.
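A compact sketch of this elimination (illustrative only; the coil positions and ranges below are made-up values): subtracting the first instance of Eq. (3) from the others cancels the quadratic terms and leaves a small linear system.

```python
import numpy as np

def trilaterate(coils, dists):
    """Solve Eq. (3) for (x, y, z) from four (or more) coil positions and
    measured ranges by subtracting the first equation from the others."""
    coils = np.asarray(coils, dtype=float)
    dists = np.asarray(dists, dtype=float)
    c0, d0 = coils[0], dists[0]
    A = 2.0 * (coils[1:] - c0)
    b = d0**2 - dists[1:]**2 + np.sum(coils[1:]**2, axis=1) - np.sum(c0**2)
    # least squares also copes with more than four coils or noisy ranges
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    return p

coils = [(0, 0, 0), (1.5, 0, 0), (0, 1.5, 0), (0, 0, 1.5)]   # example anchors [m]
true_node = np.array([0.8, 0.5, 0.7])
dists = [np.linalg.norm(true_node - np.array(c)) for c in coils]
print(trilaterate(coils, dists))                              # ~ [0.8 0.5 0.7]
```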
Fig. 4. Localization based on the triangulation method with different arrangements of coils: (a) parallel and (b) circular
3
Experimental Results
For practical reasons, the triaxial sensor nodes for our localization concept were designed in a wired and a wireless version, as shown in Fig. 5. The size of the sensor module is about 3 cm in height and 2.5 cm in diameter. In our experiments, the prototype of the wired version was investigated and a data acquisition board with a 16 bit ADC was applied, whereas the wireless one was implemented based on
Fig. 5. Wireless version of the triaxial AMR sensor (left) and block diagram of the wireless sensor node communicating with the base station and of the wired sensor node with the DAQ board and computer/Matlab for localization (right)
the MicaMote platform with a 10 bit ADC. Both versions performed well, although the wireless sensor node had slightly higher errors, up to about 10 cm per axis, than the wired one. This can be explained by the higher ADC resolution of the wired sensor node compared to the wireless one. The further localization experiments were therefore conducted using the wired sensor node. Six coils were applied in our first experimental setup. The wired version of the sensor node was located at 20 different positions in air (without the container) within a volume of 1.5 m x 1.5 m x 1.5 m. The measured localization, with respect to actual positions determined manually, achieved average errors of 1.9 cm, 1.6 cm, and 2.4 cm for |Δx|, |Δy|, and |Δz|, respectively. For another test, which is similar to the real environment of the industrial application, the sensor node was placed inside the stainless steel container filled with water, as shown in Fig. 6. The sensor node was placed and tested at 10 different positions, and then at the same 10 positions with the rotation roughly varied in
Fig. 6. A miniature size of the stainless steel container from the same material used in the industry (left) and the coil arrangement surrounding the container (right)
Fig. 7. Visualization of real sensor node coordinates (green) and computed sensor coordinates by NLM (red) and triangulation (blue): (a) first 10 positions and (b) second 10 positions after arbitrary rotation

Table 1. Localization errors in 20 positions of the sensor node in the container filled with water

Run   Actual position [cm]    NLM error [cm]       Triangulation error [cm]
      x      y      z         Δx    Δy    Δz       Δx    Δy    Δz
1     28     45     67.6       0.3   1.3   0.4     -1.4   1.1  -0.5
2     28     45     90        -1.5   0.8   1.5     -5.0  -0.2  -1.2
3     125.5  48     67.6      -7.3   0.1   3.3     -1.4   1.5   0.6
4     125.5  48     90        -5.2  -0.7   4.1     -4.6  -3.7   5.4
5     120    98     67.6       0.1  -2.6   7.0      5.9  -3.4   6.1
6     120    98     90         0.2  -1.2   2.5      3.1  -1.2   2.2
7     43     96.5   67.6       3.1  -3.1   7.8      2.1  -0.9   2.8
8     43     96.5   90         2.5  -0.2   4.4     -4.1   3.1   6.4
9     82     74     67.6       2.6  -2.3   6.6      5.9  -2.5   5.3
10    82     74     90         3.5  -1.5   5.7      5.4  -0.5   4.1
11    28     45     67.6       0.9   2.8  -0.6     -5.0   1.5  -2.5
12    28     45     90        -1.5   0.4   2.2     -5.2   0.1  -2.4
13    125.5  48     67.6      -5.4  -2.5   1.7     -1.8   2.2   0.3
14    125.5  48     90        -6.5  -0.3   3.2      4.3  -1.3   0.1
15    120    98     67.6       1.3  -2.9   5.8      4.7  -3.1   5.7
16    120    98     90         0.6  -1.3   3.2      2.8  -1.1   3.0
17    43     96.5   67.6       1.2  -2.4   8.7     -3.9  -2.7   7.9
18    43     96.5   90         3.4  -0.7   4.7     -2.7   2.1   6.2
19    82     74     67.6       2.5  -0.6   6.4      0.7  -0.3   6.8
20    82     74     90         2.1  -1.0   5.1      5.0  -0.5   4.4
|Δi| for i = x, y, z           2.6   1.4   4.1      3.7   1.6   3.7
σ|Δi| for i = x, y, z          2.1   1.0   2.5      1.6   1.1   2.5
the range of [25°, 45°]. Using the NLM algorithm, these experiments achieved an average absolute error of 2.6 cm, 1.4 cm, and 4.1 cm for |Δx|, |Δy|, and |Δz|, respectively. Figure 7(a) shows the visualization of the first 10 positions of the sensor node and Fig. 7(b) shows the visualization of the second 10 positions with a different sensor node rotation, to check the rotation invariance properties. In addition, the triangulation method was employed on the same data for comparison with the NLM algorithm. The NLM algorithm achieved slightly better localization errors than the triangulation. However, the triangulation is faster and has much less computational effort than the NLM. The triangulation can therefore be considered for implementation inside each of the wireless sensor nodes, as a trade-off between power and localization accuracy. The measured and computed localization results are given in detail in Table 1. Deviations in the measurement of the magnetic field strength lead to deviations in the computed distances and the subsequent localization. The currently observed deviations are probably caused by the employed cables, the missing calibration of the sensitivity parameter of the AMR sensors, offset and gain deviations in the instrumentation amplifiers, and the manual determination of the actual sensor node position, especially in the container. In addition, the orthogonal placement of the magnetic sensors on the sensor board still needs to be ensured to increase the accuracy of the proposed localization.
4
Conclusions
In this paper, we presented and extended a new concept of localization in WSN applied in industrial containers filled with liquid. The employment of intelligent magnetic sensing based on triaxial AMR sensors with a scalable array of emitting coils switched to ternary quasi-DC states offers a viable solution to the heretofore unamenable localization problem in containers. The advantage of this localization concept is that theoretically unlimited number of wireless sensor nodes can be deployed in the container. The coil design can be scaled to the container size and industrial requirements. A triaxial AMR-based magnetic wireless sensor has been developed as a prototype PCB-module, which is currently optimized for low power. In particular, in this paper a distributed localization algorithm based on triangulation has been successfully applied and compared with the previous centralized one based on iterative non-linear mapping [12]. The triangulation proved to be feasible for local implementation and on-line position computation in the wireless sensor nodes, since this method is faster and has much less computational effort and power consumption than the NLM algorithm. Localization in a demonstrator with a cubic volume of 150 cm side length has been achieved in the mean with less than 6 cm absolute localization error, and duration of currently 6 seconds per cycle due to relays switching of the coils. The concept proved to be viable, though numerous implementation imperfections in the proof-of-principle demonstrator system limit the performance. In future work, the switching electronics for controlling the ternary quasi-DC coil
currents has to be improved for speed. State-of-the-art power electronics will allow localization cycles in the order of milliseconds, which fully meets current industrial requirements. Issues such as the mobility properties of sensor nodes, different container sizes, and the synchronization of coil switching and sensor node timing have to be considered. Concurrently, integration issues using dies and MEMS technology for minimum size of the sensor nodes will be pursued.
Acknowledgments. This work was supported by the Federal Ministry of Education and Research (BMBF) in the program mst-AVS, project PAC4PTROSIG, under grant 16SV3604.
References 1. Rao, L., Iswandy, K., K¨ onig, A.: Cost Effective 3D Localization for Low-Power WSN Based on Modified Sammon’s Stress Function with Incomplete Distance Information. In: 14th Online World Conference on Soft Computing in Industrial Applications (WSC14), November 17-29 (2009) 2. Lenz, J., Edelstein, A.S.: Magnetic Sensors and Their Applications. IEEE Sensor Journal 6(3), 631–649 (2006) 3. Sammon, J.W.: A Nonlinear Mapping for Data Structure Analysis. IEEE Trans. Computers C-18(5), 401–409 (1969) 4. Plotkin, A., Paperno, E.: 3-D Magnetic Tracking of a Single Subminiature Coil with a Large 2-D Array of Uniaxial Transmitter. IEEE Trans. Magnetics 39(5), 3295–3297 (2003) 5. Srinivasan, A., Wu, J.: A Survey on Secure Localization in Wireless Sensor Networks. In: Furht, B. (ed.) Wireless and Mobile Communications. CRC Press/Taylor and Francis Group, London (2007) 6. Cui, J.H., Kong, J., Gerla, M.: The Challenges of Building Scalable Mobile Underwater Wireless Sensor Networks for Aquatic Applications. IEEE Network - Special Issue on Wireless Sensor Networking 20(3), 12–18 (2006) 7. Callmer, J., Skoglund, M., Gustafsson, F.: Silent Localization of Underwater Sensors Using Magnetometers. J. Advances in Signal Processing 2010, 1–8 (2009) 8. Teymorian, A.Y., Cheng, W., Ma, L., Cheng, X., Lu, X., Lu, Z.: 3D Underwater Sensor Network Localization. IEEE Trans. Mobile Computing 8(12), 1610–1621 (2009) 9. Chandrasekhar, V., Seah, W.K.G., Choo, Y.S., Ee, H.V.: Localization in Underwater Sensor Networks - Survey and Challenges. In: 1st ACM Int. Workshop Underwater Networks (WUWnet 2006), pp. 33–40 (2006) 10. Priyanath, N.B., Chakraborty, A., Balakrishna, H.: The Cricket Location-Support System. In: 6th ACM Int. Conf. Mobile Computing and Networking (2000) 11. Prakash, N.M., Spelman, F.A.: Localization of Magnetic Marker for Gi motility studies: An in Vitro Feasibility Study. In: 19th Annu. Int. Conf. IEEE on Engineering in Medicine and Biology Society, vol. 6, pp. 2394–2397 (1997) 12. Carrella, S., Iswandy, K., Lutz, K., K¨ onig, A.: 3D-Localization of Low Power Wireless Sensor Nodes Based on AMR-Sensors in Industrial and AMI Applications. In: 15th ITG/GMA-Fachtagung Sensoren und Messsysteme, pp. 522–529 (2010) 13. Lee, R.C.T., Slagle, J.R., Blum, H.: A Triangulation Method for the Sequential Mapping of Points from N-Space to Two Space. IEEE Trans. Computers 26(3), 288–292 (1977)
An Overview of a Microcontroller-Based Approach to Intelligent Machine Tool Monitoring Raees Siddiqui, Roger Grosvenor, and Paul Prickett The Intelligent Process Monitoring and Management Centre Cardiff School of Engineering, Newport Road, Cardiff, UK [email protected], [email protected]
Abstract. This paper presents a milling machine tool cutting process monitoring system based upon microcontroller architecture. The work was undertaken to explore the degree of autonomous intelligent decision making that such an approach can support. It is based upon the embedding of a number of linked algorithms within a microcontroller architecture. The practical implementation and results of the dynamic digital filtering algorithms thus employed are presented and consideration is given to how their outputs can be combined to provide high level process monitoring functions. The same technique can be used for monitoring applications for processes that depend upon health monitoring functions requiring simultaneous multiple parameter analysis. Keywords: dsPIC Microcontrollers, intelligent process monitoring, embedded autonomous systems.
1 Introduction This paper considers the monitoring of a Computer Numerically Controlled (CNC) machine tool cutting operation using autonomous techniques to diagnose the cutting process condition in real-time. Fault diagnosis is undertaken based on the deployment of analysis tools within the process. To facilitate this the system collects and analyzes process signals close to their source using embedded devices. The core of this approach is the use of digital signal Programmable Interface Controller (dsPIC) microcontrollers, making use of the available processing power, computational speed, integrated peripherals and communication modules. The resulting distributed process monitoring system is a viable solution to a wide variety of applications. Considerable work has been undertaken by the Intelligent Process Monitoring and Management (IPMM) group at Cardiff University in developing solutions using dsPIC microcontrollers. These are 16-bit devices with a built-in digital signal processor along with typical microcontroller features. Their power and speed, when combined with their front-end processing capabilities, various peripherals and communication interfaces, make them ideal for distributed applications. The main objective of this research was to develop and implement data acquisition, processing and analysis techniques using minimal electronic components and sensors. The dsPIC microcontroller-based architecture shown in Figure 1 was designed and deployed to achieve this objective.
Fig. 1. Developed microcontroller-based monitoring system architecture
This system is based on a three-tier architecture which utilizes ‘Front End Nodes’ (FENs) to enable application specific data acquisition and signal processing at the first tier. Selected parameters are analysed in a FEN for any fault symptoms using embedded procedures and applications. Signal acquisition and processing are performed at this level to take advantage of the processing capabilities of dsPIC and to exploit its potential in tool condition and other monitoring tasks. FEN generated information is shared using Controller-area Network (CAN) bus communications between the Nodes at this tier. This means that a more informed decision can be made about the health of the process autonomously at this level. This can minimize the number of false alarms being made. The second tier forms a connectivity node which interfaces the FENs and provides Ethernet and Global System for Mobile Communications (GSM) connectivity. When conditions require it this tier can also provide an additional source of data processing and provide synchronisation allowing data from various FENs to be combined and analysed. This tier also links the FENs to a server-side PC via the internet, forming the third tier, where further analysis tools can be used on a personal or industrial computer. 1.1 Embedded Tool Condition Monitoring Systems (TCMS) Many process monitoring systems and most machine tool condition monitoring systems have been designed to operate using a PC; up to now very few have been implemented on an embedded platform. Researchers considering the development of TCMS have typically utilised a multi-sensor approach with data acquisition using a data acquisition card installed in a personal computer [1]. The sensors deployed have included accelerometers, dynamometers, microphones and AE sensors. These systems
have begun to evolve onto Digital signal Processing (DSP) platform development boards using sophisticated signal processing algorithms to support decision making using tools such as neural networks [2]. Field Programmable Gate Array (FPGA) solutions are also emerging [3&4] producing a capable but low cost options. In the latest reported version the developed FPGA based tool condition monitoring system uses algorithms to detect tool breakage using spindle motor current [5]. This system is said to be adaptable for different machining processes based upon a sequence of controlled cycles and manually entered maximum thresholds to “teach” the system; the evolution of this into a “self teaching” system is now possible. Due to the cost effectiveness of the new generation of microcontroller devices it is economically feasible to embed a number of them within a machine or process to form a distributed sensor network [6]. Of particular relevance is the option provided to deploy decision making modules close to the source of the information [7]. The use of multiple nodes supports enhanced monitoring functions with the nodes equipped with a specific behavioral model. The fault detection process then becomes a collective operation. As an additional feature the use of multiple independent detection points provides a degree of redundancy that can support fault detection even with the loss of elements of the system. The application of this approach has resulted in a tool breakage monitoring system based on the frequency analysis of the spindle load signal using hardware-based reconfigurable filters [8&9]. The results from all the nodes were combined to provide a decision on the overall tool condition. The further development of this work using a dsPIC based approach is the subject of this paper. 6
2 The Developed Monitoring System Previous tool health monitoring research has indicated that the amplitude of certain frequencies acquired from the spindle load signal increase due to the occurrence of a broken cutter. In the Spindle Monitoring node represented in Figure 2, FEN-based processes for signal conditioning and anti-aliasing filtering are used to condition the signal, making it suitable for interfacing with the microcontroller. The technique simultaneously uses variable data sampling rates and a table of predefined filter coefficients to dynamically adjust its pass band characteristics to match the range of interest. The Digital IIR filters used in this research were developed and implemented in software using the Multiply-Accumulate (MAC) class of DSP instructions [10]. This is a very flexible approach that does pose some limitations on the input signal which needs to be conditioned accordingly, as shown in Figure 2. The operation of this system is considered in the following sections which also describe the design and implementation of a signal processing system utilizing Multiband Infinite Impulse Response (IIR) filters. 2.1 Infinite Impulse Response Filter operation IPMM group research has developed some innovative approaches to developing frequency analysis techniques, including a sweeping filter technique for machine tool signal analysis [8] using programmable analogue filters. The main drawback to the resulting approach was the need to introduce and manage the required programmable
analogue filters. This was seen at the time as being limited by the level of available speed and power of the current microcontroller technology. In hindsight these limitations were actually helpful in that they required the development of a more intelligent approach to filter design and operation.
Fig. 2. Spindle Monitoring FEN-based signal acquisition and analysis procedure
Designing and implementing the digital filters was approached using a menudriven and intuitive user interface provided by the device manufacturer [10]. The basic assumption made in this work was that under normal cutting conditions the cutting force applied by a typical milling cutter should be periodic with its tooth passing frequency. This tooth passing frequency depends upon the tool rotation frequency and the number of teeth present on the tool. Simply put the characteristics of a cutting process can be represented by the forces generated by each tooth removing material. Under normal conditions each tooth will remove a similar amount of material; variations from normal will result in a change in the frequency spectrum. In the “tool breakage detection mode” the centre frequencies for two of these filters were the machine tool rotation frequency (fr) and the tooth passing frequency (fp). The third filter was set following analysis of the likely effect of a broken tooth. This varies, but in particular the effect of the missing tooth was investigated and it was determined that frequencies relating to the “number of teeth less one” were obviously informative. In the following examples a four toothed cutter is considered, meaning that the third filter’s pass band was set to the third harmonic of the machine tool rotation frequency (3fr ). The required tool-related information, such as spindle speed and number of teeth was accessible in real time from connected Parameter Monitoring
node using the architecture shown in Figure 1. It was thus possible to set the sweeping filters continuously using parameters input from other FENs. IIR Filters for these frequencies were thus designed and the coefficients were programmed into the dsPIC microcontroller. It is important to stress that the intelligent utilization of information allows this approach to operate. The result of a broken tool arising as indicated by this process is represented in Figure 3.
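For illustration, the selection of the three pass bands from the cutting parameters and the coefficient look-up can be sketched as follows (in Python for readability; the real system uses pre-computed IIR coefficients embedded in the dsPIC, and the table entries below are hypothetical placeholders):

```python
# Centre frequencies of the three band-pass filters from the cutting parameters.
def centre_frequencies(spindle_rpm, n_teeth):
    f_r = spindle_rpm / 60.0                 # tool rotation frequency [Hz]
    return {"f_r": f_r,                      # rotation frequency
            "3f_r": 3.0 * f_r,               # third harmonic (missing-tooth related)
            "f_p": n_teeth * f_r}            # tooth passing frequency

# Hypothetical look-up table: spindle speed range [rpm] -> coefficient set id.
COEFF_TABLE = {(250, 1500): "set_A", (1500, 3500): "set_B", (3500, 6000): "set_C"}

def lookup_coefficients(spindle_rpm):
    for (lo, hi), coeff_id in COEFF_TABLE.items():
        if lo <= spindle_rpm <= hi:
            return coeff_id
    raise ValueError("spindle speed outside the monitored 250-6000 rpm range")

print(centre_frequencies(1200, 4))   # {'f_r': 20.0, '3f_r': 60.0, 'f_p': 80.0}
print(lookup_coefficients(1200))     # 'set_A'
```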
Fig. 3. Typical outputs of the monitoring system for healthy and broken tools
The increased processing speed of the dsPIC allowed the development of digital filters. In the initial set up there were three different IIR bandpass filters simultaneously implemented in the microcontroller. Each filter’s pass band was dynamically determined and adjusted in real time depending upon the cutting parameters. As cutting parameters vary, either by choice or due to events arising in the process, the monitoring functions adjust accordingly. The spindle speed information provided by the Parameter Monitoring node was then used by the microcontroller in determining any changes needed to the working coefficients for the bandpass filters. The system makes the necessary adjustments to the filter coefficients as spindle speed varies from 250 to 6000 rpm. The required tool-related information, such as spindle speed and number of teeth was accessible in real time from connected Parameter Monitoring node FEN using the architecture shown in Figure 1. It was thus possible to set the sweeping filters continuously using parameters input from other FENs. Filter coefficients for different ranges of sampling rate/spindle speed were calculated and embedded within the microcontroller in the form of a look-up table. 2.2 Relative Energy Index (REI) The output of the IIR filter is a time domain sinusoidal signal of the centre frequency of the filter whose amplitude varies in accordance with the energy contained by that frequency component. Since direct amplitude varies periodically between maximum and minimum, it cannot easily be used for diagnostic routines. To more effectively utilize the filter output, a measure called the Relative Energy Index (REI) was defined by taking the difference between maximum and minimum output values for “n”
consecutive samples, where “n” equates to one tool rotation and was dynamically set to the current speed. The REI is calculated after each tool rotation and a separate value is calculated for each filter output. These can be analysed to detect tool breakage using a technique that observes the three frequencies in parallel during the performance of a cutting process. The spindle load data is acquired and buffered in segments of one tool rotation. During the data acquisition stage of the next sample, the processing of the previously acquired sample is completed. The processing includes digital filtering and updating the minimum and maximum values. After a segment of data is acquired and processed; the peak to peak difference (defined as the REI indicator for that revolution) is calculated and local decision making is performed using indicators to confirm the health of the tool, as outlined in Section 3. This technique ensures that data processing is completed in shortest possible time and results are available immediately after the revolution is completed. To illustrate the basis of this method consider the action of a four toothed end mill. A healthy cutter will register four even cuts per revolution, with relatively little variation between cutting force. The relative amplitude values for the “healthy cutter” tool rotation frequency will be small and will not vary significantly, as seen in Figure 4.
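The REI computation itself reduces to a per-revolution peak-to-peak measurement of each filter output; a minimal sketch (in Python rather than the embedded dsPIC code) is:

```python
# Relative Energy Index: peak-to-peak value of a filter output over one
# tool revolution; one REI value per revolution and per band-pass filter.
def relative_energy_index(samples):
    return max(samples) - min(samples)

def rei_per_revolution(filtered_signal, samples_per_rev):
    """Split the filtered stream into one-revolution frames and compute
    an REI for each frame."""
    return [relative_energy_index(filtered_signal[i:i + samples_per_rev])
            for i in range(0, len(filtered_signal) - samples_per_rev + 1,
                           samples_per_rev)]
```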
Fig. 4. The Tool Rotation Frequency output for a healthy and broken cutter
The relative amplitudes for a broken cutter however can be seen to increase markedly. This is due to the fact that previously tool rotation alone was responsible for generating this frequency whereas now, the once per revolution action occurring when using the broken cutter is also adding to the strength of this frequency. By monitoring the REI of this filtered frequency component, the tool breakage can easily be detected, as indicated in Figure 5. This ensures that the monitoring technique focuses on the filter’s optimum performance to ensure the best possible results with the minimum number of possible false alarms. This process is underpinned with the establishment of a threshold, shown in Figure 5, crossing which can be taken as an indication of an abnormal variation in cutting. It should be noted however that there are many one-off
events, such as machining an inclusion, breaking-in or breaking-out of a workpiece and step changes in the depth of cut that could cause a significant variation in tool force and to the relative amplitude arising during a single tool rotation.
Fig. 5. The REI output for a healthy and broken cutter
An example of this may be observed at the start of the cut shown in Figure 5. This could potentially be misdiagnosed by the REI technique alone since the REI value crosses the threshold. But, since these disruptions are very much short term, they will not cause similar changes to the frequency based analysis of the load, nor will they remain in subsequent REI levels, since they are not cyclic. The diagnostic process can thus be linked to sustained increases in REI, which, when confirmed by changes to the frequency components, can be used as a trigger for an alarm signal to warn of tool breakage. The result is a system that takes two inputs and reaches decisions which are more robust than the two sources taken individually. This is an important feature of the deployed system and one that is emphasized by the addition of further monitoring techniques. 2.3 Tool Rotation Energy Variation (TREV) TREV is calculated along the same lines as the calculation of REI, using the acquired spindle load signal which is processed directly, instead of using the filtered signals. The FEN architecture shown in Figure 2 is again used here. A pre-amplification circuit at the signal conditioning stage included a blocking high pass filter deployed to remove the cutting force variations arising due to tool entry, tool exit and changes in other milling parameters. This made the signal more stable and reliable for time domain processing. A typical output can be seen in Figure 6. It was observed that, as with the REI, the overall signal variations for each tool rotation were very small for a healthy cutter and relatively large for broken tooth cases. A virtual node was implemented in the system for find this parameter which provided the broken tool indication after one revolution. This could trigger the sequence to verify tool breakage via the frequency component calculations.
Fig. 6. The Tool Rotation Energy Variation output for a healthy and broken cutter
3 Monitoring System Operation and Decision Making The nature of the deployed frequency domain analysis techniques and the dynamic digital filtering methodology allow the Spindle Monitoring node to perform a number of sensing functions simultaneously using multiple frequency estimation applications. The assessments made by the individual monitoring methods are combined according to Table 1. In this table ‘0’ represents a normal level, ‘1’ indicates that threshold has been crossed and ‘X’ denotes a “don’t care” condition. In the Decision column ‘?’ indicates an unexpected result that triggers a request for advanced diagnosis. Table 1. Spindle Monitoring Node Tool Condition Decision Table
Case   TREV   REI fr   REI 3fr   REI fp   Decision
A      0      0        0         X        Healthy
B      1      1        X         X        Broken tooth
C      1      0        1         X        Broken tooth
D      1      0        0         1        Blunt tool
E      1      0        0         0        Wait for next frame
F      0      1        X         X        ?
G      0      0        1         X        ?
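As an illustration, the combination rules of Table 1 can be expressed as a small decision function (sketched in Python; the FEN implements this logic in its embedded firmware), where each argument is the thresholded flag of the corresponding indicator (1 = threshold crossed, 0 = normal):

```python
# Decision logic of Table 1; 'X' (don't care) entries simply never get tested.
def diagnose(trev, rei_fr, rei_3fr, rei_fp):
    if not trev:
        if not rei_fr and not rei_3fr:
            return "Healthy"                       # case A
        return "Request advanced diagnosis"        # cases F and G ('?')
    if rei_fr:
        return "Broken tooth"                      # case B
    if rei_3fr:
        return "Broken tooth"                      # case C
    if rei_fp:
        return "Blunt tool"                        # case D
    return "Wait for next frame"                   # case E
```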
The extracted features formed the basis for diagnostics and tool health monitoring following a procedure outlined in Figure 7. This procedure was embedded in the monitoring FEN. This process operates continuously; while the data for a tooth
rotation is being acquired, the calculations on the previous acquisition are carried out in parallel and the results are compared against parameter-dependent thresholds for decision making. Thus, under “normal” conditions (Case A) the monitoring system would not react. If a broken tooth arises, it will be indicated by a combination of these measures (Cases B and C). When these “broken tool” events are diagnosed, an alarm message is sent to the Parameter Monitoring node which, in combination with information from other FENs, can make the decision to stop the cutting process.
Fig. 7. The cutter health condition diagnosis process
The increase in overall cutting force levels and the unevenness of the cutting process arising when operating with a blunt tool (Case D) will produce conditions under which TREV would be large but the REI of the tool rotation related frequencies would be low and would not cross the threshold. In such a circumstance a “blunt tool” message is sent up through the architecture to the Management Application level. This would produce an operator or maintenance action, such as suggesting that the tool should be checked after the machining cycle finishes. The possible delay between TREV and REI based detection may require an additional tool rotation for positive confirmation (Case E). This is due to the responsiveness of the two measures and to when in the cycle the tooth breaks. Unresolved events (Cases F and G), which are marked in the Decision column as “?”, can be caused by occurrences arising during the cutting cycle such as tool entry, shoulder milling, etc. These should not result in an alarm condition and are thus deliberately filtered from the TREV assessment, as outlined in Section 2.3. They also arise due to common but uncontrollable events in the process, such as the machining of inclusions (hard spots) and swarf adhesion to a cutter tooth, both of which are transient events that will normally be resolved without halting the cutting process. However, since this type of occurrence may have long term implications with regard to the health of the cutter, the Spindle Monitoring node can send the acquired data to the Management Application for analysis. At this level a “human expert” can apply advanced diagnosis procedures, such as a more detailed frequency analysis, and compare this event to previously recorded events. If required, updates to the monitoring functions in the FENs can be downloaded using a run-time self-programming capability.
4 Conclusions
This paper has outlined a microcontroller-based implementation of real-time signal analysis techniques. The system was shown to be able to perform all the computations needed by the deployed algorithms, to enact the decision making process, to undertake housekeeping and to communicate the results in approximately 2.5 milliseconds. At the maximum spindle speed of 6000 rpm the required cycle time would be 5 milliseconds. This is twice the time required by the deployed algorithms. This means that the microcontroller can perform some additional processing algorithms if and when required. Particular emphasis is placed upon the ability of the Spindle Monitoring node to make an instant but accurate diagnosis of a fault condition. This is facilitated by the integration of modules which may all be monitoring the same process, and can support the fault diagnosis procedure. This can provide a very powerful and flexible sensor system, which can be economically deployed to monitor a wide range of machines and processes.
Use of Two-Layer Cause-Effect Model to Select Source of Signal in Plant Alarm System
Kazuhiro Takeda1, Takashi Hamaguchi2, Masaru Noda3, Naoki Kimura4, and Toshiaki Itoh5
1 Department of Materials Science and Chemical Engineering, Shizuoka University, 432-8561 3-5-1 Johoku, Naka-ku, Hamamatsu-shi, Shizuoka
2 Graduate School of Engineering, Nagoya Institute of Technology, 466-8555 Gokiso-cho, Showa-ku, Nagoya-shi, Aichi
3 Graduate School of Information Science, Nara Institute of Science and Technology, 630-0192 8916-5 Takayama, Ikoma-shi, Nara
4 Department of Chemical Engineering, Kyushu University, 819-0395 744 Motooka, Nishi-ku, Fukuoka
5 612-8324 580 Azukiya-Cho, Fushimi-ku, Kyoto
[email protected], [email protected], [email protected], [email protected], [email protected]
Abstract. Alarm systems provide an interface between operators and the machinery in a chemical plant, because they allow abnormalities or faults to be detected by operators at an early stage. Alarms should be used to enable operators to diagnose faults and plan countermeasures, and nuisance alarms should be eliminated. We propose a selection algorithm for sets of pairs of alarm variables and their signs, where the signs denote the upper or lower limits of the alarm variables. The selected sets of pairs are theoretically guaranteed to be able to qualitatively distinguish all assumed faults. For the algorithm, we propose a two-layer cause-effect model, which represents the cause-effect relationships between state variables based on the topology of a plant. The model is applied to a simple process. The simulation results illustrate the usefulness of our method. Keywords: Plant Alarm System, Two-Layer Cause-Effect Model.
1 Introduction
Larger-scale chemical plants and more intelligent plant monitoring systems lead to a larger monitoring domain for each operator, more complex fault detection and diagnosis, and more trouble caused by human error. The importance of the plant alarm system as a human-machine interface for the early detection of abnormalities or faults by operators has therefore increased. In a plant alarm system, alarm variables with upper and lower limits of their monitoring ranges are used for quality, cost, safety and so on. If the values of the alarm
variables exceed the upper or lower limits, the alarm system alerts with sounds and lights, and displays detailed information on the CRT of the DCS (Distributed Control System). On the other hand, unnecessary alarms may be assigned, because there is no effective method for plant alarm system design. Many inadequate alarms cause alarm floods within a short time. Such inadequate alarms lead to misjudgments and/or erroneous operations in these situations, and they often cause various plant troubles. Therefore, regulations and guidelines for plant safety have recently been established in the U.S. and Europe. ISA (International Society of Automation) in the U.S. requires strict risk management throughout the plant life cycle, which consists of the design, operation and management phases [1]. EEMUA in Europe requires that the philosophy, objectives and functions of the alarm system be clearly developed, and that inadequate alarms be eliminated [2]. However, they do not describe a design method for the alarm system. Luo et al. propose a method in which a set of alarm source variables is selected based on an abnormality propagation diagram [3], which shows the propagation process of abnormalities caused by a malfunction. The alarm system designer first selects a symptom and then searches for candidate alarm source variables that are at the nearest neighbor distance from the listed malfunctions in the diagram and that could be brought about by only one malfunction. Bhushan et al. propose a design of sensor locations that maximizes observability and reliability under the available resources using a Cause-Effect (CE) model [4]. The CE model is easy to maintain through revamping of the plant. However, the above papers do not address whether all assumed faults in the plant can be distinguished using the selected alarm source candidates. Takeda et al. propose an alarm variable selection method using a CE model [5]. The method theoretically guarantees that the selected sets of alarm source variables are able to distinguish all assumed faults, but it cannot decide which upper and/or lower limits should be used as alarms. In this paper, a pair of an alarm variable and its sign {+, −}, denoting the upper or lower limit of the monitoring range, is called an “alarm pair”, and a set of alarm pairs is called an “alarm pairs set”. The alarm pairs sets which can distinguish all assumed faults are called an “alarm pairs sets family”. Cause-effect relationships between deviations of state variables from their steady state are represented by a two-layer CE model. A new algorithm to derive the alarm pairs sets family using this model is proposed; the algorithm decides which upper and/or lower limits are used. Alarm system design comprises the selection of the alarm pairs sets family and the adjustment of the thresholds of the monitoring ranges; this paper discusses the selection of the alarm pairs sets family. In the second section, we define the alarm pairs sets family. In the third section, the two-layer CE model, which represents cause-effect relationships between deviations of state variables from their steady state, is proposed. In the fourth section, the algorithm to derive the alarm pairs sets family using the model is proposed. A simple process is used to demonstrate the efficiency of the proposed model and algorithm in the fifth section.
2 Selection Problem of the Alarm Pairs Sets Family
This paper proposes an algorithm to decide which upper and/or lower limits are used. It is assumed that a measured variable of the plant is s, and that the set of measured variables is S. An alarm on the upper limit of a monitoring range is represented by {+}, and one on the lower limit is represented by {−}. A pair of a measured variable s and the upper or lower limit {+, −} of its monitoring range is called a “measured pair” sp, and the set of measured pairs is called a “measured pairs set” Sp. All fault origins which may occur in the plant are mapped to pairs of state variables and their signs. Such a pair of a state variable and its sign is called an “origin pair” z, and the set of all origin pairs in the plant is denoted by Z. The set of origin pairs to be distinguished by the alarm system is denoted by C. It is assumed that C is given by PHA (Process Hazard Analysis), such as a HAZOP (HAZard and OPerability) study, and that two or more faults cannot occur simultaneously. In this paper, the selection problem of the alarm pairs is defined as selecting an alarm pairs sets family A, i.e., a family of sets of measured pairs sp that qualitatively distinguish all the elements of C. To distinguish the elements of C means to ensure that operators can identify which of the candidate fault origins is the real cause of a failure in an abnormal situation.
3 Two-Layer CE Model
A two-layer CE model which represents the cause-effect relationships between deviations of state variables from their steady state is proposed. In the model, two types of nodes representing deviations of a state variable i from its steady state are defined.
i+: A node which represents an upper deviation of i from its steady state.
i−: A node which represents a lower deviation of i from its steady state.
The nodes representing deviations of i are contained in the + layer and/or in the − layer of the model. If only a one-way deviation can occur, for example for a pump or a manual valve, the node is contained in only one layer. Direct cause-effect relationships between deviations of state variables are represented by branches between the nodes. The model can be constructed from the topology of the plant piping. The two-layer CE model of the example tank system in Fig. 1 is shown in Fig. 2. The state variables of the system are the inlet flow rates F1 and F2, the liquid level L, and the outlet flow rate F3. When the inlet flow rate F1 increases and exceeds its upper deviation from steady state, the liquid level L also increases and exceeds its upper deviation. This cause-effect relationship is therefore represented by a branch from the node F1+ to the node L+.
Fig. 1. Example tank system
Fig. 2. Two-layer CE model of example tank system
Using the model, the propagation of effects from an assumed fault can be studied qualitatively. For example, assume that the source pressure of F1 increases while all the state variables of the example tank system in Fig. 1 are normal and at steady state. The increase of the source pressure causes the inlet flow rate F1 to increase and deviate upward; this situation is described by saying that the node F1+ is fired. When a node corresponding to an alarm pair is fired, the corresponding alarm alerts. Firing F1+ causes L+ to fire, because F1+ and L+ are connected by a branch. The increase of the source pressure thus affects the state variables of the plant along the branch directions. Stage 1: the increase of the source pressure increases the inlet flow rate F1. Stage 2: the increase of the inlet flow rate F1 increases the liquid level L. Stage 3: the increase of the liquid level L affects the outlet flow rate F3.
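A minimal sketch of this propagation, representing the two-layer CE model of Fig. 2 as a directed graph over signed nodes, is given below; the branch list is read off Fig. 2, and the code is only an illustration of the firing idea, not part of the proposed algorithm’s implementation.

# Two-layer CE model of the example tank system (Fig. 2) as a directed graph,
# together with qualitative effect propagation ("firing").
branches = {
    "F1+": ["L+"], "F2+": ["L+"], "L+": ["F3+"], "F3+": [],
    "F1-": ["L-"], "F2-": ["L-"], "L-": ["F3-"], "F3-": [],
}

def fired_nodes(origin):
    """All nodes that may be fired when the origin node fires."""
    fired, frontier = {origin}, [origin]
    while frontier:
        node = frontier.pop()
        for successor in branches.get(node, []):
            if successor not in fired:
                fired.add(successor)
                frontier.append(successor)
    return fired

# Increasing feed pressure fires F1+, which propagates to L+ and then F3+.
print(sorted(fired_nodes("F1+")))   # ['F1+', 'F3+', 'L+']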
4 Selection Algorithm of Alarm Pairs Sets Family
In this section, an algorithm to derive the alarm pairs sets family using the two-layer CE model is proposed.
4.1 Design Procedure of Alarm System
A design procedure for the alarm system is proposed as follows. The two-layer CE model and the set of measured variables S are assumed to be given.
(a) Determine C for the objective plant.
1. List all the fault origins to be distinguished by the alarm system in the objective plant, and map the fault origins to the appropriate pairs of nodes and signs in the two-layer CE model of the plant. These pairs constitute C.
2. Even if all the available measured pairs are used, some elements of C may not be distinguishable. In that case, some elements of C have to be removed. To select the elements to be removed, determine grades for all elements of C using a risk matrix based on the frequencies of the fault origins, the impacts of their effects, and so on.
(b) Select the alarm pairs sets family.
3. Map all the measured pairs sp of the plant to nodes of the two-layer CE model.
4. Select a family of sets of measured pairs which can distinguish all the elements of C. The algorithm using the model is described in the next section. If all the elements of C are distinguished, then make the derived family the alarm pairs sets family A and go to 6; otherwise go to 5.
5. Remove some of the elements of C using the grade of each element, and go to 4.
(c) Determine the monitoring ranges.
6. The alarm pairs sets in A are candidates for the alarm variables and their monitoring ranges. The monitoring ranges of the alarm variables are adjusted using a dynamic simulator or similar tools. This step is out of the scope of this paper.
4.2 Selection Algorithm of Alarm Pairs Sets Family
A selection algorithm for an alarm pairs sets family which can qualitatively distinguish all the elements of C is proposed as follows. The set of pairs of nodes and signs corresponding to the possible fault origins of an assumed alarm pairs alert pattern is called a “candidates pairs set”. A candidates pairs set which is not a proper subset of any other candidates pairs set, over all assumed alarm pairs alert patterns, is called a “greatest candidates pairs set”. In general, two or more greatest candidates pairs sets may exist.
Step 1: If C is empty, then terminate the calculation, since there are no fault origins to be distinguished. Otherwise, make A empty and go to Step 2.
Step 2: Let Q be the power set of S.
Step 3: Assume each element of Q as a set of alarm variables and construct the assumed alarm pairs alert patterns for these alarm variables. Three types of alert state are assumed for each alarm variable, represented by the signs {+, 0, −}, respectively:
(+) an alert state in which the alarm variable exceeds the upper limit of its monitoring range;
(0) no alert, i.e., the alarm variable is within its monitoring range;
(−) an alert state in which the alarm variable exceeds the lower limit of its monitoring range.
Step 4: Derive the family of greatest candidates pairs sets for all the assumed alarm pairs alert patterns.
Step 5: Output A and C, and finish.
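The following is a much simplified sketch of the idea behind Steps 1–5: it assumes that the set of measured pairs fired by each origin pair in C is already known (e.g., from propagation over the two-layer CE model) and enumerates candidate alarm pairs sets, keeping those under which every element of C produces a distinct, non-empty alert pattern. The fired-pair data, the variable names and the omission of the greatest-candidates-pairs-set construction are simplifying assumptions for illustration only.

from itertools import chain, combinations

def subsets(measured_pairs):
    s = list(measured_pairs)
    return chain.from_iterable(combinations(s, r) for r in range(1, len(s) + 1))

def distinguishes(candidate, fired_by_origin):
    """A candidate alarm pairs set distinguishes C if every origin produces a
    distinct, non-empty alert pattern restricted to the candidate."""
    patterns = [frozenset(candidate) & fired for fired in fired_by_origin.values()]
    return all(patterns) and len(set(patterns)) == len(patterns)

fired_by_origin = {                       # hypothetical propagation results
    "P1[+]": frozenset({"F1+", "L1+", "F2+"}),
    "P1[-]": frozenset({"F1-", "L1-", "F2-"}),
    "F2[-]": frozenset({"F2-", "L1+"}),
}
measured = ["F1+", "F1-", "F2+", "F2-", "L1+", "L1-"]
family = [set(c) for c in subsets(measured) if distinguishes(c, fired_by_origin)]
print(min(family, key=len))   # one of the smallest distinguishing sets, e.g. {'F2-', 'L1+'}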
5 Case Study
5.1 Example Process
The efficiency of the proposed method is illustrated using an example process with a recycle loop, as shown in Fig. 3. A two-layer CE model for the process is shown in Fig. 4. Raw material is fed from outside the boundary to tank 1. Product is discharged outside the boundary through tank 2. A part of the intermediate is recycled to tank 1. In Fig. 3, P, F, L, and V are the state variables for pressure, flow rate, liquid level, and valve position, respectively. All state variables except P1 and P2 are measured. The pairs of variables and signs of the fault origins to be distinguished in the process are assumed to be as follows.
─ P1[+]: High pressure of the feed
─ P1[−]: Low pressure of the feed
─ F2[−]: Blockage of the pipe from the reactor to tank 2
─ V4[+]: Abnormal opening of valve V4
─ V4[−]: Abnormal closing of valve V4
Therefore, C is {P1[+], P1[−], F2[−], V4[+], V4[−]}, and Sp is {F1+, F1−, F2+, F2−, F3+, F3−, F4+, F4−, L1+, L1−, L2+, L2−, V1+, V1−, V2+, V2−, V3+, V3−, V4+, V4−}.
Fig. 3. Two tanks process
Fig. 4. Two-layer CE model for reaction process
5.2 Results of Alarm Pairs Sets Family
Using the proposed method, the alarm pairs sets family A that distinguishes all elements of C was selected, as shown in Table 1. As the results show, the smallest selected sets of alarm variables consisted of 3 variables, and there were 7 such sets. A smaller number of alarm variables is preferred, because it implies a lower potential for nuisance alarms or alarm floods. The total number of sets is 344.
Table 1. Results of alarm variables selection
Number of alarm variables   Number of sets of alarm variables
3                           7
4                           36
5                           80
6                           100
7                           76
8                           35
9                           9
10                          1
Total                       344
One of the smallest alarm pairs sets is {F1+, F1−, F2+, F2−, V4+, V4−}. The alarm pair F2− should be selected to distinguish F2[−]. The alarm pairs F1− and F2− should be selected to distinguish P1[−]. The alarm pairs F1+ and F2+ should be selected to distinguish P1[+]. The alarm pairs F2+ and V4− should be selected to distinguish V4[−]. The alarm pairs F2− and V4+ should be selected to distinguish V4[+]. Takeda et al. proposed that the alarm variables {F1, F2, V4} should be selected, but the upper and/or lower limits of the monitoring ranges could not be decided [5].
6 Conclusions
We have proposed a systematic method that selects all alarm pairs sets which qualitatively distinguish all the pairs of variables and signs of the assumed fault origins to be distinguished. The method uses a two-layer CE model which represents the cause-effect relationships between deviations of state variables from their steady state. What the alarm system designer then has to do is to select an alarm pairs set from the results of the proposed method and to adjust the monitoring ranges; there is no need to select alarm variables and adjust monitoring ranges by trial and error. Because the proposed model is based on the topology of the objective plant piping, the rationale of the alarm system design is clear and the maintenance of the model is easy. On the other hand, many chemical processes have recycle streams. A recycle stream creates strongly connected components in the two-layer CE model, and it is very difficult to select alarm pairs sets that distinguish the pairs of variables and signs of fault origins within a strongly connected component. In future work, an alarm pairs sets selection method for processes with recycle streams is expected.
Acknowledgements
This research has been conducted under Workshop 28 (Alarm Management) of the Japan Society for the Promotion of Science (JSPS) 143rd Committee on Process Systems Engineering. Discussions with the workshop members are greatly appreciated. This work was partially supported by the JSPS Grant-in-Aid for Scientific Research (B), No. 21310112, 2009.
References
1. ANSI/ISA-18.2-2009: Management of Alarm Systems for the Process Industries
2. Engineering Equipment and Materials Users’ Association (EEMUA): Alarm Systems - A Guide to Design, Management and Procurement, Publication No. 191, 2nd edn. EEMUA, London (2007)
3. Luo, Y., Liu, X., Noda, M., Nishitani, H.: Systematic design approach for plant alarm systems. J. Chem. Eng. of Jpn. 40, 765–772 (2007)
4. Bhushan, M., Rengaswamy, R.: Design of sensor location based on various fault diagnostic observability and reliability criteria. Comp. Chem. Engineer. 24, 735–741 (2000)
5. Takeda, K., Hamaguchi, T., Noda, M.: Plant Alarm System Design based on Cause-Effect Model. Kagaku Kogaku Ronbunshu 36, 136–142 (2010)
Coloured Petri Net Diagnosers for Lumped Process Systems
Attila Tóth1, Erzsébet Németh2, and Katalin M. Hangos3,1
1 Faculty of Information Technology, University of Pannonia, Veszprém, Hungary
2 School of Chemical Engineering, The University of Queensland, Brisbane, Australia
3 Process Control Research Group, Systems and Control Laboratory, Computer and Automation Research Institute, Budapest, Hungary
Abstract. A model-based method is proposed in this paper for algorithmic generation of diagnosers in coloured Petri net (CPN) form from characteristic input-output traces obtained from a qualitative model of lumped process systems. The qualitative model contains the description of the considered persistent faults in the form of fault indicators, and it is transformed into a CPN. The diagnosers are constructed from a CPN obtained by the process mining methodology using the generated input-output traces for identically constant inputs. The concepts and methods are illustrated using a simple case study consisting of an industrial storage tank system with additive and multiplicative failures on sensors, and on the behaviour of a pump. Keywords: Coloured Petri nets, diagnoser, qualitative model, process mining.
1 Introduction
Coloured Petri nets (CPNs) are very useful and popular discrete event system models [9] that are used for both control design and diagnosability studies. They have been shown to be equivalent to qualitative difference equation models [7], and there is an established methodology for constructing dynamic models in the form of qualitative differential and algebraic equations (DAEs) from the usual DAE models of lumped process systems [6]. The use of qualitative models for automated HAZOP analysis is not new (see e.g. [2]); it has even been generalized to batch process systems in a recent study [12], but it has not been used for diagnostic purposes. On the other hand, much progress has been achieved recently in the control community in the area of diagnosability of discrete event systems (see e.g. Lunze [11]) that forms the theoretical basis of our work. Based on our recent work on constructing qualitative models from dynamic mass balances [8] for diagnostic purposes, the aim of the present work is to construct diagnosers in CPN form that can automatically detect faults from the observable input-output traces of the system.
2 Detectability and Diagnosability of Lumped Process Models
For the purpose of detecting and diagnosing faults in lumped process systems, a special dynamic model of the system is needed that includes not only the description of its normal operating mode, but its considered faulty modes as well. The common form of dynamic models for lumped process systems [6] is a set of ordinary differential equations equipped with algebraic constitutive equations.
2.1 Confluence Type Qualitative Models of Faulty Process Systems
For the sake of fault detection and diagnosis, the lumped process system is decomposed into simple, non-divisible elements that can be static, like sensors, actuating elements (pumps, valves, etc.) and connecting elements (pipes, flanges, branches), or dynamic, i.e. holdups for which dynamic conservation balances describe the dynamic behaviour (tanks, reactors, filters, etc.). The simplest possible case is considered here, when only overall mass balances are taken into account as the underlying dynamic models of the holdups [8]. The faults are considered to be persistent, and have a fault indicator variable χ• associated with them. In this way a single difference equation (a sampled dynamic mass conservation balance) is associated with each of the dynamic elements, and a single algebraic equation is used for any static element. The order of magnitude qualitative version [7] of these equations is constructed using the following qualitative range space of the variables and constants:

Q = {0, L, N, H},  B = {0, 1},  QE = {e−, 0, L, N, H, e+}    (1)
where 0 is no, 1 is true, L is low, N is normal, H is high, while e+ and e− stand for values outside of the physically meaningful range space.
Solution of the Qualitative Model Equations. Having the above qualitative range space of variables, one needs to define a qualitative calculus, which involves at least qualitative addition and multiplication in order to compute the solution of the model equations above. Table 1 shows the defining qualitative addition table that is used in this paper.

Table 1. Qualitative positive order of magnitude addition table

[a] + [b]   0    L    N    H
0           0    L    N    H
L           L    N    H    e+
N           N    H    e+   e+
H           H    e+   e+   e+

It should be noted that the qualitative
addition defined in Table 1 is not the usual interval-based definition [7] but a refined one using heuristic elements. The solutions of the obtained qualitative model equations are then generated using the qualitative calculus, enumerating all possible combinations of the qualitative values of the input and disturbance variables and of the initial conditions of the state variables. The solution of the qualitative model equations is finally arranged into a solution table.
The Solution Table of a Tank Element with Overflow Fault. A tank with manipulable in- and outflow is assumed, with overflow as a possible fault. The time-discretized dynamic equation is of the form:

[m̂](k + 1) = [m](k) + [v_in](k) − [v_out](k)
([m](k + 1), [v_of](k + 1)) = (H, [m̂](k + 1) − H)   if [m̂](k + 1) = e+
([m](k + 1), [v_of](k + 1)) = ([m̂](k + 1), 0)       otherwise    (2)
where [x](k) is the qualitative value of a variable x at the kth time instance, [m] is the mass in the tank, [v_in] and [v_out] are the inlet and outlet mass flow rates, and [v_of] is the overflow mass flow rate. With appropriate scaling and normalization, the interesting part of the qualitative solution of Eq. (2) is given by Table 2.

Table 2. Qualitative solution table for the tank element with overflow. X denotes any element of Q, and {L, N, H} stands for a set of qualitative values.

[v_of](k+1)   [m](k+1)     [m](k)       [v_in](k)   [v_out](k)
0             {L, N, H}    {L, N, H}    X           X
L             H            H            L           0
L             H            H            N           L
L             H            H            H           N
N             H            H            N           0
N             H            H            H           L
0             0            0            0           {L, N, H}
0             H            N            N           L
L             H            N            N           0
0             L            N            N           H
0             H            N            H           N
L             H            N            H           L
...           ...          ...          ...         ...
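A minimal sketch of how the addition table (Table 1) and the case split of Eq. (2) can be evaluated is given below; for brevity the outflow is taken as 0 and the exact overflow magnitude [m̂](k+1) − H is not computed, so the sketch only reproduces whether a row of Table 2 reports an overflow, not its qualitative value.

# Qualitative addition of Table 1 and the structure of the tank update of Eq. (2).
ADD = {  # symmetric entries of Table 1
    ("0", "0"): "0", ("0", "L"): "L", ("0", "N"): "N", ("0", "H"): "H",
    ("L", "L"): "N", ("L", "N"): "H", ("L", "H"): "e+",
    ("N", "N"): "e+", ("N", "H"): "e+", ("H", "H"): "e+",
}

def q_add(a, b):
    return ADD[(a, b)] if (a, b) in ADD else ADD[(b, a)]

def tank_step(m_k, v_in_k):
    """Qualitative tank update with v_out = 0, following the case split of Eq. (2)."""
    m_hat = q_add(m_k, v_in_k)
    if m_hat == "e+":
        return "H", "overflow"      # ([m](k+1) = H, [v_of](k+1) != 0)
    return m_hat, "0"               # no overflow

print(tank_step("H", "L"))   # ('H', 'overflow'), matching the H/L/0 row of Table 2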
Sensor with Additive Fault. The simple static sensor senses the true value x and transforms it into the measured value x^m with a possible bias b that is small, i.e. [b] = L. The presence and the sign of the bias are described by the fault indicator χ_x ∈ {−1, 0, 1}, where −1 denotes the negative and 1 the positive additive sensor fault. The describing qualitative model equation is the algebraic equation

[x^m](k) = [x](k) + χ_x [b]    (3)

where [x] ∈ Q, [x^m] ∈ QE and χ_x ∈ {−1, 0, 1}. The qualitative solution table of the above equation is given in Table 3.
Table 3. Qualitative solution table for the sensor element with and without additive fault

[x^m](k)   [x](k)   [χ_x]
0          0        0
L          L        0
N          N        0
H          H        0
L          0        1
N          L        1
H          N        1
e+         H        1
e−         0        −1
0          L        −1
L          N        −1
N          H        −1
Composite System Models in CPN Form. A CPN [9] model is constructed from the elements (various tanks, sensors, actuators, and connecting elements) of the process system. Each element has a transition associated with it, while the measurable input and output variables, as well as the state and fault indicator (disturbance) variables, are described by places. The transition functions are determined from the qualitative solution table describing the element in question.
2.2 Traces, Detectability, Diagnosability
With the above qualitative order of magnitude range set of the variables, the detection and isolation of faults can be based on discrete event sequences, called input-output traces. An input-output trace is a timed sequence of events that only contains events related to the measurable inputs and outputs of the system. A fault is detectable in a model if there exists at least one input-output trace such that the occurrence of the fault can be uniquely derived from it. This trace will be called a fault signature of the fault. Note that to detect the presence of at least one fault, it is enough to recognize that the system is not in its normal mode, that is, to recognize a deviation from a characteristic nominal, normal input-output trace. A fault is diagnosable in a model if it is detectable and its qualitative value can also be uniquely determined from at least one fault signature. With our assumptions on the faults (persistency and the presence of fault indicator variables), each detectable fault is at the same time diagnosable.
3 Coloured Petri Net Diagnosers
A diagnoser CPN that recognizes the fault signature of a fault indicator variable consists of transitions corresponding to each discrete time instance present in the fault signature input-output trace. Its input places are connected to those measurable input and output variables of the system to be diagnosed that correspond to the qualitative values of the measurable system variables in the fault signature. Its output places present the diagnostic output by means of the qualitative value of the fault indicator variable. The state of the CPN diagnoser at every discrete time instance shows the possible faults or the paths that may lead to the faulty situation.
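As an illustration of this behaviour, the sketch below realizes a single-fault diagnoser as a plain state machine that advances along a fault signature as observations arrive; the observation keys, the two-step signature and the restart policy are hypothetical, and the actual diagnosers are CPNs with the guard conditions described above.

class SignatureDiagnoser:
    def __init__(self, fault_name, signature):
        self.fault_name = fault_name
        self.signature = signature   # list of expected observation dicts
        self.position = 0

    def observe(self, observation):
        """Feed one time step of observations; return the fault name once the
        whole signature has been seen in order, otherwise None."""
        if self.position == len(self.signature):
            return self.fault_name                     # already detected
        expected = self.signature[self.position]
        if all(observation.get(k) == v for k, v in expected.items()):
            self.position += 1
        else:
            self.position = 0                          # restart matching
        if self.position == len(self.signature):
            return self.fault_name
        return None

# Hypothetical two-step signature for a negative bias on the first level sensor.
diag = SignatureDiagnoser("L1 negative bias",
                          [{"v_in": "N", "L_BTX1": "N", "L_BTX2": "H"},
                           {"v_in": "N", "L_BTX1": "N", "L_BTX2": "H", "L_Bund": "H"}])
for step in [{"v_in": "N", "L_BTX1": "N", "L_BTX2": "H"},
             {"v_in": "N", "L_BTX1": "N", "L_BTX2": "H", "L_Bund": "H"}]:
    result = diag.observe(step)
print(result)   # 'L1 negative bias'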
3.1 The Design of CPN Diagnosers
The confluence type qualitative model of the composite system can be constructed by appropriately concatenating the qualitative solution tables of the components. The corresponding hierarchical CPN model is built from the elements of the system stored in a CPN library.
Generating the Fault Signatures. Using the composite CPN model, the qualitative input-output behaviours of the diagnosable system can be generated by simulating the CPN model with all possible identically constant input values in different faulty and non-faulty situations. By logging the observable input-output signatures together with the non-observable fault indicators, all possible input-output traces of the model can be generated. Once the input-output behaviours in the different faulty and non-faulty cases are available, the detectability of each situation needs to be assessed. The first step is to filter out those input-output behaviours whose input-output traces are unique to a fault situation (i.e. the fault signatures), as the modelled faults can be diagnosed based on them. The next step is to create equivalence classes by grouping those input-output traces which are identical under different fault situations; these show the non-detectable faults.
The Structure of the Diagnoser CPN. The input places of the diagnoser are the places corresponding to the observable input and output variables in the model that describe their qualitative values at a given discrete time. Each output place of the CPN diagnoser is related to a detectable fault and holds a token if the particular fault has been detected. The internal places of the CPN correspond to the observable states of the diagnosable system. Each transition in the diagnoser CPN represents a transition between two observable states of the system to be diagnosed. Therefore, there is a transition between two internal places if one could be the next observable state of the other, i.e. if there is an input-output event which moves the diagnoser from one state to the other. The input-output event related to a transition is encoded as the guard condition of the transition.
3.2 The Algorithmic Generation of CPN Diagnosers
The algorithmic generation of the CPN diagnoser is performed in three consecutive steps.
1. To generate the possible input-output behaviours, the composite CPN process model implemented in CPNTools [3] was extended to simulate all possible inputs in faulty and non-faulty situations and to create sufficient logs with the MXML Logging Extensions for CPN Tools [1] to enable further analysis.
2. The process mining framework ProM [13] was used to discover and extract several kinds of information from a log. This tool helps us to identify the unique traces, or fault signatures, and the equivalence classes. Using the Alpha algorithm of ProM, a Petri net was constructed and exported to the CPNTools format.
3. The resulting CPN can be used as the diagnoser of the model; it should be manually configured and connected to the measured qualitative data source, such as a monitoring system or simulator, in order to obtain real-time data for its operation.
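A minimal sketch of the signature-extraction idea behind steps 1 and 2 is given below: simulated runs are grouped by their observable trace, traces that occur under exactly one fault label are kept as fault signatures, and shared traces form the non-detectable equivalence classes. The log format and the example traces are assumptions for illustration and do not reproduce the MXML/ProM tool chain.

from collections import defaultdict

def split_traces(simulated_runs):
    """simulated_runs: iterable of (fault_label, observable_trace) pairs,
    where observable_trace is a tuple of qualitative observations."""
    labels_by_trace = defaultdict(set)
    for fault_label, trace in simulated_runs:
        labels_by_trace[tuple(trace)].add(fault_label)
    signatures = {trace: next(iter(labels))
                  for trace, labels in labels_by_trace.items() if len(labels) == 1}
    ambiguous = {trace: labels
                 for trace, labels in labels_by_trace.items() if len(labels) > 1}
    return signatures, ambiguous

runs = [("normal", ("NNH0000", "NNHN000", "NNHN000")),
        ("L1 negative bias", ("NLN0LLL", "NNH0LLL", "NNHLLLL")),
        ("pump fault", ("NNH0000", "NNHN000", "NNHN000"))]   # same trace as normal
signatures, ambiguous = split_traces(runs)
print(list(signatures.values()))   # ['L1 negative bias']
print(list(ambiguous.values()))    # [{'normal', 'pump fault'}] -- not detectable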
3.3 Complexity of the CPN Diagnosers: Distributed Diagnosis
The complexity of the CPN diagnosers, in terms of the number of their places, depends on the number of fault signatures, which is determined by the number of observable variables in the system to be diagnosed. Since the diagnoser is a representation of the observable part of the state space of the model, the size of the CPN diagnoser can be affected by the state explosion problem. In [4] and [5] a distributed on-line fault detection and isolation method is presented for a Petri net (PN) modelled system. Those PN models can be decomposed into several place-bordered PNs satisfying certain conditions. The PN diagnoser of each smaller sub-model can then be constructed, and the small PN diagnosers communicate during the diagnostic process when they exchange their local diagnostic results with their neighbours. As is formally proven in [4], [5], the presented distributed algorithm recovers the same diagnostic information as the centralized one.
4 Case Study
An industrial system is used in the case study to illustrate the proposed concepts and algorithms. 4.1
The BTX Storage and Transfer System
Benzene-Toluene-Xylene (BTX) is a by-product of the gas processing operation in BlueScope Steel Ltd, Australia. BTX is collected in storage tanks, then periodically it is loaded into road tankers for transport to offsite customers. As a simple illustrative example, let us consider the system with consists of bulk storage tank (BTX tank) tha can overflow to a bund tank, equipped with piping, valves, pump and instrumentation, as illustrated in in the top of Fig. 1. The bottom of the figure shows the corresponding hierarchical CPN model. The confluence type qualitative model of the composite system has been constructed by concatenating the qualitative solution table of each element, these are described above in section 2.1. The measurable qualitative input variable of the system is the inlet flow rate [vin ](k). The measurable qualitative output variables are m – the measured mass (level) in the BTX tank, [Lm BT X1 ](k) and [LBT X2 ](k), m – the measured mass (level) in the Bund tank, [LBund ](k), – the measured pressures in the process line, [P1m ](k) and [P2m ](k), m – the measured output flowrate of the system, [vout ](k).
Coloured Petri Net Diagnosers for Lumped Process Systems
395
FM1 vI XAL1
XAM2
L1
L2
FC
ve
LBTX
XAP1
XAF2
XMP2
P1
FM2
P2
vTO
pO
XAL3 M
L3
BTX
v TO CP XACP
Bund
LBund
Fault indicators:
X_ACP
L_BTX(t0) InitBTXLevel L_Bund(t0) InitBundLevel
X_AL1 BOOL
Initial values:
X_ML2 ADDSENSORFLT
X_AL3 BOOL
X_AP1
X_MP2
ADDSENSORFLT
BOOL
X_AF2 BOOL
ADDSENSORFLT
FLOW V_in BUNDLEVEL
SIMPLIFIED BTX SYSTEM
V^T_out
Simplified BTX
FLOW
FLOW
CP_state(t0) InitPumpState PUMPSTATE
Measurements:
L^m_BTX1 SENSOROUT
L^m_BTX_2 SENSOROUT
L^m_Bund SENSOROUT
P^m_1 SENSOROUT
P^m_2 SENSOROUT
v^m_out SENSOROUT
Fig. 1. BTX storage and transfer system and its CPN realization
In addition, we need to specify
– the initial mass in the BTX tank, [L_BTX](t0),
– the initial mass in the Bund tank, [L_Bund](t0), and
– the initial state of the centrifugal pump, [CP_state](t0).
The considered faults are additive and multiplicative failures on the sensors and on the behaviour of the pump.
4.2 Constructing the CPN Diagnoser
Using the CPN Tools simulator [3], all possible input-output traces assuming a constant input flowrate have been generated and fed to ProM [13]. A significant part of the resulting CPN is seen in Fig. 2. The transition boxes in the left column of the figure encode the fault situations, the initial state and the input signal values in the form “Fault i1 i2 i3 [v0, v1, v2] χ1 χ2 χ3 χ4 χ5 χ6 χ7”, where “i1” is the qualitative initial value of the BTX mass, “i2” is the qualitative initial value of the Bund mass, “i3” is the qualitative initial state of the pump, “v0, v1, v2” are the qualitative values of the inlet flow rate in the different
Fig. 2. A part of the log mining result created by the Alpha algorithm in ProM
Fig. 3. A part of the resulting diagnoser for the composite BTX system model
discrete time instances, and “χi” is a fault indicator. The measurements in each qualitative simulation time step are encoded as “Step i1 i2 i3 (output1)...(outputn)” with outputk = “vk, o1, o2, o3, o4, o5, o6”, where “vk” is the input at the kth time step and “o1”–“o6” are the measured output values. The paths that are disjoint from all other paths in the resulting CPN correspond to the fault signatures; they have been selected and exported to the CPNTools format. This selected sub-CPN forms the basis of the CPN diagnoser. Selected
fault signatures are denoted in red in Fig. 2, referring to the case when the first BTX level sensor has a negative additive bias failure. The connection of the diagnoser to the original CPN model – used as a simulator – was performed by sharing the observable places corresponding to the input and measurement variable places. Fig. 3 illustrates a part of the composite CPN diagnoser corresponding to the input-output traces marked in red in Fig. 2. The bottom of the figure shows the subnet of one step of the diagnoser, which models the input-output observer as the guard of the transition.
5 Conclusion and Discussion
A novel model-based method has been proposed in this paper for the algorithmic generation of diagnosers in coloured Petri net (CPN) form using characteristic input-output traces obtained from a qualitative model of lumped process systems. The diagnosers are constructed from a CPN obtained by the process mining methodology using the generated characteristic input-output traces for identically constant inputs. Based on our experience with a small industrial case study, the advantages and disadvantages of the approach can clearly be seen. The process mining approach provides a result that can readily be used as a basis for the CPN diagnosers, but the state space of the problem grows very quickly with the size of the problem. In the case of single fault occurrences only, we have experienced a state space larger than 2000 elements in our case study. This makes it difficult to select the individual paths suitable for the diagnosis. A remedy for this problem is to design distributed diagnosers, similarly to [4], [5], which is also possible in this case and will be the subject of our future work.
Acknowledgements
This research has been supported by the Hungarian Scientific Research Fund through grant K67625, which is gratefully acknowledged. This work has also been supported by the Australian Research Council Linkage Grant LP0776636, and by the Control Engineering Research Group of the Budapest University of Technology and Economics.
References 1. Alves de Medeiros, A.K., Gunther, C.W.: Process mining: Using CPN Tools to create test logs for mining algorithms. In: Proceedings of the 6th Workshop and Tutorial on Practical Use of Coloured Petri Nets and the CPN Tools, pp. 177–190 (2005)
2. Bartolozzi, V., Castiglione, L., Piciotto, A., Galuzzo, M.: Qualitative models of equipment units and their use in automatic HAZOP analysis. Reliability Engineering and Systems Safety 70, 49–57 (2000)
3. CPN Group, University of Aarhus, Denmark: CPNTools 2.2.0, http://wiki.daimi.au.dk/cpntools/
4. Genc, S., Lafortune, S.: Distributed diagnosis of discrete-event systems using Petri nets. In: van der Aalst, W.M.P., Best, E. (eds.) ICATPN 2003. LNCS, vol. 2679, pp. 316–336. Springer, Heidelberg (2003)
5. Genc, S., Lafortune, S.: Distributed diagnosis of place-bordered Petri nets. IEEE Transactions on Automation Science and Engineering 4(2), 206–219 (2007)
6. Hangos, K.M., Cameron, I.T.: Process Modelling and Model Analysis. Academic Press, London (2001)
7. Hangos, K.M., Lakner, R., Gerzson, M.: Intelligent Control Systems: An Introduction with Examples. Kluwer Academic Publisher, New York (2001)
8. Hangos, K.M., Németh, E., Lakner, R., Tóth, A.: Detectability and diagnosability of faults in lumped process systems. In: Pierucci, S., Buzzi Ferraris, G. (eds.) 20th European Symposium on Computer Aided Process Engineering - ESCAPE20, pp. 1447–1452. Elsevier B.V. (2010)
9. Jensen, K.: Coloured Petri Nets. Basic Concepts, Analysis Methods and Practical Use. In: Basic Concepts. Monographs in Theoretical Computer Science, vol. 1. Springer, Heidelberg (1997)
10. Lakner, R., Németh, E., Hangos, K.M., Cameron, I.T.: Multiagent realization of prediction-based diagnosis and loss prevention. In: Ali, M., Dapoigny, R. (eds.) IEA/AIE 2006. LNCS (LNAI), vol. 4031, pp. 70–80. Springer, Heidelberg (2006)
11. Lunze, J.: Diagnosability of deterministic I/O automata. In: Prep. 7th IFAC Symposium on Fault Detection, Supervision and Safety of Technical Processes, pp. 1378–1383 (2009)
12. Palmer, C., Chung, P.W.H.: An automated system for batch hazard and operability studies. Reliability Engineering and Systems Safety 94, 1095–1106 (2009)
13. van der Aalst, W.M.P., van Dongen, B.F., Gunther, C.W., Mans, R.S., Alves de Medeiros, A.K., Rozinat, A., Rubin, V., Song, M., Verbeek, H.M.W., Weijters, A.J.M.M.: ProM 4.0: Comprehensive support for real process analysis. In: Kleijn, J., Yakovlev, A. (eds.) ICATPN 2007. LNCS, vol. 4546, pp. 484–494. Springer, Heidelberg (2007)
14. Venkatasubramanian, V., Rengaswamy, R., Kavuri, S.N.: A review of process fault detection and diagnosis Part II: Qualitative models and search strategies. Computers & Chemical Engineering 27, 313–326 (2003)
15. Wen, Y.L., Jeng, M.D.: Diagnosability of Petri nets. In: 2004 IEEE International Conference on Systems, Man and Cybernetics, pp. 4891–4896 (2004)
Proactive Control of Manufacturing Processes Using Historical Data
Manfred Grauer, Sachin Karadgi, Ulf Müller, Daniel Metz, and Walter Schäfer
Information Systems Institute, University of Siegen, 57076 Siegen, Germany
{grauer,karadgi,mueller,metz,jonas}@fb5.uni-siegen.de
Abstract. Today’s enterprises have complex manufacturing processes with several automation systems. These systems generate enormous amount of data in real-time representing feedbacks, positions, and alerts, among others. This data can be stored in relational databases as historical data which can be used for product tracking and genealogy, and so forth. However, historical data is not been utilized to proactively control the manufacturing processes. The current contribution proposes a novel methodology to overcome the aforementioned drawback. The methodology encompasses three process steps. First, offline identification of critical control-related parameters of manufacturing processes and defining a case base utilizing previously identified process parameters. Second, update the case base with real-time data acquired from automation systems during execution of manufacturing processes. Finally, employ similarity search algorithms to retrieve similar cases from the case base and adapt the retrieved cases to control the manufacturing processes proactively. The proposed methodology is validated to proactively control the manufacturing process of a molding machine. Keywords: case-based reasoning, enterprise integration, historical data, manufacturing processes, proactive control, similarity metrics.
1 Introduction
Enterprises are confronted with increasing pressure to manufacture products with high quality, reduced lead times, low cost and mass customization, and at the same time to increase shareholders’ profitability. Due to the increased complexity of products and their related manufacturing processes, these challenges are further compounded by events occurring on the shop floor, like non-adherence to product specifications and resource breakdowns. Hence, enterprises endeavor to overcome the aforesaid challenges by enhancing their monitoring and control of manufacturing processes. An enterprise can be hierarchically classified into numerous levels. For instance, VDI 5600 [1] classifies an enterprise into different manufacturing execution system (MES) levels: (i) enterprise control level, (ii) manufacturing control level and (iii) manufacturing level. Attempts are being made to integrate an enterprise’s MES levels based on ISO 15704 [2]. Enterprise integration (EI) enhances monitoring and control
of enterprise processes (i.e., business and manufacturing processes). Unfortunately, EI is an elusive goal, and in many enterprises it is not realized or only inadequately realized. Business processes are predominantly located at the enterprise control level and are concerned with achieving the enterprise’s long-term strategies. Business applications (e.g., an ERP system) generate planned performance values (TO-BE) essential to sustain competitive advantages. These planned values are transactional and generated in non-real time, i.e., in months or weeks [3]. On the contrary, manufacturing processes are employed to accomplish the objectives set at the enterprise control level. Automation systems are available at the manufacturing level (or shop floor) to perform the manufacturing processes. An enormous amount of data is generated by these systems during process execution in real-time, i.e., in seconds or milliseconds, indicating the actual performance (AS-IS) of the manufacturing system. Knowledge is embedded into enterprise processes by enterprise members [4] and described through process data. This process data is updated online during execution of the enterprise processes in terms of feedbacks. Research has been carried out to integrate TO-BE and AS-IS values for real-time control of enterprise processes [5], [6]. These integrated values are stored in relational databases as historical data, which is periodically exploited in offline processes to reveal embedded knowledge utilizing knowledge discovery in databases (KDD), and to calculate key performance indicators (KPIs) and overall equipment effectiveness (OEE). However, historical data has not been utilized for proactive control of enterprise processes in real-time. During production, enterprises are confronted with non-adherence to product specifications, which is the focus of the current contribution. Deviations in manufacturing process parameters (e.g., the required pressure is not achieved) are one of the principal reasons for non-adherence. In addition, products are not assessed for adherence to specifications immediately after they are manufactured but only after a certain period of time, resulting in decreased productivity. Deferred inspection of product quality may be due to constraints in the manufacturing systems or to the enterprise’s quality strategy. Also, prolonged unobserved deviations lead to reduced resource availability. Finally, decision making (e.g., concerning product quality) can be a complex task spread across different MES levels, and depends upon the quantity and quality of information. Unfortunately, historical data has not been utilized to enhance these decision making processes. On the whole, delays in acknowledging automation system deviations and inefficient (proactive) decision making processes result in dwindling productivity. Therefore, it is necessary to perceive the automation systems’ deviations in real-time, intercept these deviations and proactively control the manufacturing processes, as well as predict their outcomes. Elaborate research has been carried out in the area of proactive assembly systems. This research has identified the parameters an assembly system needs in order to have proactive abilities. In addition, extensive research has been carried out at the Information Systems Institute in the area of similarity search for product data, and consequently, various similarity search algorithms have been developed. These algorithms can be adapted for proactive control of manufacturing processes. The current contribution is organized as follows.
Section 2 defines the term proactive and its parameters for an assembly system, introduces similarity search techniques, and presents the state-of-the-art related to the use of historical data in manufacturing. Section 3 elaborates a novel research methodology for proactive control of manufacturing processes, and thereby enhancing
monitoring and control of manufacturing processes. To validate the elaborated methodology, an industrial case study is described in Section 4. Finally, conclusions are presented in Section 5.
2 State-of-the-Art
According to Webster’s dictionary, proactive is defined as “controlling a situation by causing something to happen rather than waiting to respond to it after it happens” [7]. A proactive system reduces the magnitude of disruptions by providing real-time predictive ability during process execution [8]. Elaborate research has been carried out in the area of proactive assembly systems [8], [9]. In this regard, researchers have identified the proactivity abilities of a system: level of automation (LoA), level of information (LoI) and level of competence (LoC). A tool has been presented to determine the proactivity of the assembly system under various scenarios [8]. One of the scenarios, “operators/systems notice the signals, are able to interpret them and have tools to act correctly”, is equally valid for manufacturing systems, which need more attention. In many instances, design practices are based on past experiences, and these experiences are adapted to solve new design problems [10]. At the Information Systems Institute, extensive research has been carried out in the area of similarity search, and as a result, various similarity search algorithms have been developed to identify similar products in databases [11], [12], [13]. The developed similarity search is based on case-based reasoning (CBR). CBR consists of four phases: RETRIEVE the most similar solved case(s), REUSE the retrieved case(s) to solve the current case, REVISE the reused solution and RETAIN the solution to be used in the future [14]. The retrieve phase has two major challenges: the calculation of similarity or distance between cases and the establishment of a fast retrieval process. For retrieving the most similar cases, several similarity and distance metrics for various data types have been developed by researchers. The Levenshtein distance and the Smith-Waterman similarity measure are used for comparing alphanumeric data [15]. The Minkowski distance, or its special form the Euclidean distance, is used to evaluate the distance between numerical values. Also, the Hamming distance can be used to estimate the distance between two input values of equal length for numerical, binary, or string data. To assess structural similarities (e.g., between bills of materials), a graph-based distance measure has been applied utilizing an extended Jaccard similarity coefficient [16]. Retrieved cases can be selected for future reuse if their calculated similarity or distance fulfills a predefined similarity threshold. For example, a normal (Gaussian) distribution is used to classify cases for reuse after assessing the similarity [17]. For a fast retrieval process, special indexing structures for databases have been exploited, like the M-Tree or R*-Tree [13], [18], [19]. The aforementioned similarity and distance metrics and retrieval methods have been applied to product data [13]. In the current contribution, these metrics and retrieval methods need to be adapted to identify similar process data in historical data. Apart from utilizing historical data for KDD, KPIs and OEE, historical data has been utilized to create inspection reports and to predict maintenance behavior in manufacturing. An approach was proposed for the automated creation of inspection reports by
detecting deviations between the realization logs written by CNC machines during the execution of instructions and the approved manufacturing instructions [20]. A methodology for designing a reconfigurable prognostics platform was presented to assess and predict the performance of machine tools [21]. The presented methodology incorporates the Watchdog Agent®, which converts multi-sensory data into machine performance-related information assisted by historical data. Also, a technique was elaborated that is capable of achieving higher long-term prediction accuracy by comparing signatures of degradation processes with historical records using measures of similarity [22]. Likewise, diagnostic and prognostic systems were presented to estimate the remaining useful life of engineered systems [23], [24].
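As a concrete illustration of the distance measures named above, the following minimal Python sketch (not taken from the cited tools; the sample values are purely illustrative) shows a Levenshtein distance for alphanumeric data, a Minkowski/Euclidean distance for numerical values, and a Hamming distance for inputs of equal length.

def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def minkowski(x, y, p=2):                 # p = 2 gives the Euclidean distance
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)

def hamming(x, y):                        # inputs of equal length
    return sum(a != b for a, b in zip(x, y))

print(levenshtein("AB-401", "AB-402"), minkowski([1.0, 2.0], [4.0, 6.0]), hamming("1011", "1001"))
# -> 1 5.0 1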
3 Research Approach

A proactive system detects and mitigates disturbances before their occurrence and, consequently, enhances production efficiency [8]. A methodology is elaborated to realize proactive control of manufacturing processes. This methodology makes it possible to perceive the manufacturing system's deviations in real time, to intercept these deviations, and to proactively control the manufacturing processes using historical data in conjunction with a rule-based system (RBS) and a workflow management system (WMS).
Fig. 1. Overview of proactive control of manufacturing processes
The envisaged methodology encompasses three process steps (see Fig. 1). The first process step consists of the offline analysis of automation systems and their manufacturing processes, and the identification of critical control-related process parameters (e.g., pressure). In addition, suitable weights are assigned to these identified parameters, which assist in calculating the overall similarity of the retrieved cases. The cases are managed in a case base whose structure has to be defined based on the identified parameters. The values of the identified parameters are acquired from the automation systems during the execution of the manufacturing processes in real time and have to be stored in the case base. Finally, for the real-time process values, sufficiently similar cases are retrieved from the case base and adapted to proactively control the manufacturing processes. The aforementioned process steps are elaborated in the following sub-sections.

3.1 Identification of Critical Control-Related Process Parameters

Today's automation systems are complex, and several inputs are required by these systems to execute the manufacturing processes. In addition, an enormous amount of data is generated by the automation systems in real time, denoting information like
feedbacks, product positions and alerts. For the monitoring and control of these manufacturing processes, it is essential to analyze the automation systems and their manufacturing processes to identify critical control-related process parameters. The analysis and identification of critical control-related process parameters can be achieved by employing methods from data mining along with structured interviews. In addition, the analysis and identification of process parameters can be strengthened by reengineering techniques and by creating data flow diagrams to reveal the automation systems' interdependencies [25]. Data mining is a particular sub-process of the KDD process, and its methods and algorithms are utilized in various manufacturing domains (e.g., maintenance, control) [26]. Therefore, various data mining methods like regression analysis (e.g., [27]), clustering, classification, and association can be employed to identify critical control-related process parameters from historical data. In addition, to inspect for sufficient data quality, the determined parameters can be checked using histogram-based methods [28]. In some instances, it is advantageous to validate previously identified control-related parameters and perhaps reduce the number of identified parameters using structured interviews with domain experts. However, it is sometimes sufficient to perform only structured interviews with domain experts to identify the critical control-related process parameters. While identifying and analyzing the critical control-related parameters, weights need to be assigned to these identified process parameters. The weights can be used for the calculation of the overall similarity between the real-time data of a manufacturing process and the cases from the case base. At the outset, all parameters can be assigned the same weight, i.e., all weights equal to one. Alternatively, individual parameters can be assigned individual weights depending upon their importance in the manufacturing processes. Finally, the case base has to be structured using the identified critical parameters; mostly, this case base is a subset of the historical data.

3.2 Data Acquisition during Manufacturing Process Execution

To enhance real-time monitoring and control of manufacturing processes, integration within and across the various MES levels is mandatory. In this regard, an IT-framework for EI has been envisaged [5]. This IT-framework provides the basic architecture for the current contribution. AS-IS values (e.g., applied pressure) from the automation systems are available via an OPC server in real time (see Fig. 2). These values are integrated with their corresponding TO-BE values (e.g., required pressure) from an ERP system, and are made available in the EI layer. Besides storing these integrated values in relational databases for product tracking and genealogy, the values are used for real-time control of enterprise processes using traceable objects [6]. Traceable objects represent control-relevant objects like orders, products and resources. They are instantiated simultaneously with a corresponding workflow instance in the WMS along with the TO-BE values. In addition, during the execution of the actual manufacturing processes, the AS-IS values generated by the automation systems are used to update the status of the traceable objects. The changes in the traceable objects' status are analyzed online by an RBS which is in charge of controlling the manufacturing processes by dispatching the necessary control data to the automation systems.
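One possible shape of such a case record is sketched below; the field names, parameter names and the outcome label are illustrative assumptions, not part of the described IT-framework.

from dataclasses import dataclass

@dataclass
class Case:
    # identified critical control-related parameters with their TO-BE and AS-IS values
    parameters: dict          # e.g. {"pressure": {"to_be": 5.0, "as_is": 5.2}}
    outcome: str = "unknown"  # e.g. "stable" / "unstable", stored with the case later

# equal weights as the starting point; individual weights may be assigned
# later according to the importance of each parameter
weights = {"pressure": 1.0, "temperature": 1.0, "valve_position": 1.0}

case_base = []   # a list of Case records; a structured subset of the historical data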
Fig. 2. IT-framework for digital enterprise integration (adapted from [5])
3.3 Proactive Control of Manufacturing Processes

The envisaged IT-framework (see Fig. 2) utilizes an RBS and a WMS for the control of manufacturing processes. However, it does not utilize historical data for the real-time control of these processes. As a consequence, the control of manufacturing processes tends to be reactive. To overcome this drawback, a similarity detection unit (SDU) is introduced which utilizes historical data, as depicted in Fig. 3. The SDU is based on the CBR technique and incorporates all the phases of CBR, as detailed below.
Fig. 3. Proposed methodology for proactive control of manufacturing processes
1. RETRIEVE: Automation systems generate an enormous amount of data and, therefore, the case base tends to be large. A case base contains m stored cases (i.e., c1, …, ci, …, cm) and each case is described by n critical control-related process parameters (i.e., p1, …, pj, …, pn) as defined in Section 3.1. Integrated data (designated as the new case c′) from the EI layer is made available online to the SDU. It is necessary to use efficient retrieval techniques (e.g., M-Tree indexing-based retrieval) to retrieve cases from the case base. Next, each retrieved case ci is compared with the new case c′ to determine the overall similarity as described in Eq. (1). First, calculate the individual similarity for all critical parameters using any of the distance measures mentioned
Proactive Control of Manufacturing Processes Using Historical Data
405
in Section 2, and then determine the overall similarity os(c′, ci) using the weights wj. Finally, select those retrieved cases which fulfill the predefined threshold (e.g., 95%); a small sketch of this step is given after the list below.
os(c', c_i) = \frac{\sum_{j=1}^{n} w_j \cdot sim(p'_j, p_{i,j})}{\sum_{j=1}^{n} w_j}, \quad i = 1, \ldots, m \qquad (1)
2. REUSE: Knowledge is defined as a "combination of information and human context that enhances the capacity for decisions" [29]. Hence, it is essential to analyze the retrieved cases and identify the previous decisions, the consequences of these decisions and the context surrounding the decisions. The information encompassed in the retrieved cases can be utilized to predict the outcome of the current manufacturing process (i.e., whether the process is stable or unstable). Finally, the prediction is stored along with the corresponding integrated data (see Fig. 3).
3. REVISE: The prediction values, along with the integrated data, are evaluated in the RBS and WMS to proactively control the manufacturing processes. The prediction values are exploited for tweaking the current manufacturing process parameters to provide optimized future process runs and process outcomes.
4. RETAIN: The outcome of the RBS and WMS (i.e., the control data) is also retained along with the associated integrated data to be used in the future.
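The following minimal Python sketch illustrates the RETRIEVE step of Eq. (1); it is an illustration rather than the authors' implementation, and the per-parameter similarity function (here a simple normalization over an assumed 10-unit span) is an assumption.

def overall_similarity(new_case, stored_case, weights, sim):
    # Eq. (1): weighted mean of the per-parameter similarities sim(p'_j, p_ij) in [0, 1]
    num = sum(w * sim[p](new_case[p], stored_case[p]) for p, w in weights.items())
    return num / sum(weights.values())

def retrieve(new_case, case_base, weights, sim, threshold=0.95):
    scored = [(c, overall_similarity(new_case, c, weights, sim)) for c in case_base]
    return [(c, s) for c, s in scored if s >= threshold]   # cases above the threshold

# illustrative similarity for one numerical parameter (assumed normalization span of 10)
sim = {"pressure": lambda a, b: max(0.0, 1.0 - abs(a - b) / 10.0)}
weights = {"pressure": 1.0}
hits = retrieve({"pressure": 5.2}, [{"pressure": 5.1}, {"pressure": 9.0}], weights, sim)
# only the first stored case (similarity 0.99) passes the 95 % threshold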
4 Case Study - Proactive Control of a Molding Process

The proposed methodology is a concept, and attempts are being made to validate it in an industrial scenario with emphasis on a sand molding machine. The molding machine in consideration is a dedicated machine with a high production rate (manufacturing approximately 250 molds per hour), and it controls downstream as well as upstream manufacturing processes. This machine simultaneously produces upper and lower molds at time t1 according to the mounted dies, as illustrated in Fig. 4. The lower mold is manually assessed for conformance at time t2 along its process route (i.e., after a certain number of lower molds have been produced) and sand cores are placed in the accepted mold. The late manual inspection is mainly due to constraints in the manufacturing system. Finally, the upper and lower molds are assembled irrespective of the conformance result, and molten material is poured into the mold assembly at time t3, provided the lower mold adheres to the conformance specifications. In the aforementioned scenario, the manual inspection happens late along the process route, at time t2. In case of rejection, however, a certain number of lower and upper molds would already have been produced (i.e., between time t1 and t2), which will likewise have a high probability of being rejected and could also result in non-utilization of the upper mold due to constraints in the manufacturing system. As a consequence, monitoring and control of the mold manufacturing process is reactive and results in a reduced production rate and increased wastage of resources (e.g., sand,
Fig. 4. Validation of proposed methodology in sand casting enterprise
energy). To overcome this drawback, the SDU is introduced (i.e., just after time t1) to retrieve similar cases online during the execution of the mold manufacturing process (i.e., for the lower mold). The retrieved information is used to predict whether the current process is stable or unstable. Numerous inputs (e.g., clamp pressure, shaft speed, valve settings) are necessary to specify the mold manufacturing process, and these inputs define the TO-BE values. During the execution of the mold manufacturing process, the control devices of the molding machine generate real-time data indicating the actual values applied, i.e., the AS-IS values. TO-BE and AS-IS values are integrated and stored in a relational database. As a consequence, a row is created in the relational database for each mold. Further, critical control-related process parameters are identified, which assists in structuring the case base. This case base is a subset of the aforementioned relational database. Data from each previously created row is used to define a case. Currently, a structured interview with the production manager is used to identify these process parameters. Once sufficient data is stored in the relational database, the data mining techniques described in Section 3.1 can be employed to refine the set of control-related process parameters. For simplicity, each of these control-related process parameters is assigned a weight equal to one. The aforesaid case base is indexed using an M-Tree indexing structure for fast retrieval of similar cases. This structure is updated every time a new case is stored in the database (i.e., when a row is added) during the execution of the mold manufacturing process. The control-related process parameters of the retrieved cases ci and the new case c′ are subjected to similarity or distance calculation in real time. Euclidean and Hamming distance measures are employed to determine the difference for numerical and binary data, respectively. Finally, the overall similarity os(c′, ci) is determined as described in Eq. (1). Only the cases whose overall similarity is above the predefined threshold of 95% are selected to predict whether the current process is stable or unstable. Depending upon the prediction, the inputs of the mold manufacturing process are tweaked to produce lower molds with better quality, thereby avoiding future rejects.
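A hedged end-to-end sketch of this case study follows; the parameter names, normalization spans and the majority-vote prediction rule are illustrative assumptions, while the Euclidean/Hamming split, the equal weights and the 95 % threshold follow the description above.

def numeric_sim(a, b, span):                  # Euclidean distance on one parameter,
    return max(0.0, 1.0 - abs(a - b) / span)  # normalised to a similarity in [0, 1]

def binary_sim(a, b):                         # Hamming distance on binary valve settings
    return 1.0 - sum(x != y for x, y in zip(a, b)) / len(a)

def case_similarity(new, old):
    sims = [numeric_sim(new["clamp_pressure"], old["clamp_pressure"], span=50.0),
            numeric_sim(new["shaft_speed"], old["shaft_speed"], span=200.0),
            binary_sim(new["valve_settings"], old["valve_settings"])]
    return sum(sims) / len(sims)              # Eq. (1) with all weights equal to one

def predict_stability(new, case_base, threshold=0.95):
    hits = [c for c in case_base if case_similarity(new, c) >= threshold]
    if not hits:
        return "unknown"
    stable = sum(c["outcome"] == "stable" for c in hits)
    return "stable" if stable >= len(hits) / 2 else "unstable"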
5 Conclusions

The current paper has proposed a novel methodology for proactive control of manufacturing processes by utilizing historical data. The methodology involves three process steps. First, offline analysis and identification of the critical control-related parameters of the manufacturing processes, assignment of weights to these parameters, and definition of the structure of a case base. Second, updating the case base (i.e., the historical data) with real-time process data acquired from the automation systems during the execution of the manufacturing processes. Finally, a similarity detection unit (SDU) is employed to retrieve similar cases from the case base and adapt the current case so that the manufacturing processes are controlled proactively. Further, the proposed methodology is being validated by controlling the manufacturing process of a molding machine.
Acknowledgments

Parts of the work presented here have been supported by the German Federal Ministry of Science and Research within the German InGrid initiative (#01AK806C) and the FP6 EU-funded IST project BEinGRID (IST5-034702).
References 1. VDI 5600: Manufacturing Execution System (MES) - VDI 5600 Part 1 (2007) 2. ISO 15704: Requirements for Enterprise Reference Architecture and Methodologies, ISO 15704:2000/Amd 1:2005 (2005) 3. Kjaer, A.: The Integration of Business and Production Processes. IEEE Control Systems Magazine 23(6), 50–58 (2003) 4. Cheung, W.M., Maropoulos, P.G.: A Novel Knowledge Management Methodology to Support Collaborative Product Development. In: Cunha, P.F., Maropoulos, P.G. (eds.) Digital Enterprise Technology - Perspectives and Future Challenges, pp. 201–208 (2007) 5. Grauer, M., Metz, D., Karadgi, S.S., Schäfer, W., Reichwald, J.W.: Towards an ITFramework for Digital Enterprise Integration. In: Huang, G.Q., Mak, K.L., Maropoulos, P.G. (eds.) Proc. of Int. Conf. on Digital Enterprise Tech., Hong Kong, pp. 1467–1482 (2009) 6. Grauer, M., Karadgi, S.S., Metz, D., Schäfer, W.: An Approach for Real-Time Control of Enterprise Processes in Manufacturing using a Rule-Based System. In: Proc. of Multikonferenz Wirtschaftsinformatik, pp. 1511–1522 (2010) 7. Webster’s Online Dictionary, http://www.websters-online-dictionary.org/ 8. Dencker, K., Fasth, A.: A Model for Assessment of Proactivity Potential in Technical Resources. In: Huang, G.Q., Mak, K.L., Maropoulos, P.G. (eds.) Proc. of Int. Conf. on Digital Enterprise Tech., Hong Kong, pp. 855–864 (2009) 9. Dencker, K., Stahre, J., Gröndahl, P., Mårtensson, L., Lundholm, T., Johansson, C.: An Approach to Proactive Assembly System. In: Proc. of IEEE Int. Symposium on Assembly and Manufacturing, Michigan, pp. 294–299 (2007) 10. Sandberg, M., Larsson, T.: Automating Redesign of Sheet-Metal Parts in Automotive Industry using KBE and CBR. In: Proc. of IDETC/CIE 2006, ASME 2006 Int. Design Eng. Tech. Conf. & Computers and Inf. in Eng. Conf., USA (2006)
11. Lütke Entrup, C., Barth, T., Schäfer, W.: Towards a Process Model for Identifying Knowledge-Related Structures in Product Data. In: Reimer, U., Karagiannis, D. (eds.) PAKM 2006. LNCS (LNAI), vol. 4333, pp. 189–200. Springer, Heidelberg (2006) 12. Müller, U., Lütke Entrup, C., Barth, T., Grauer, M.: Applying Image-based Retrieval for Knowledge Reuse in Supporting Product-Process Design in Industry. In: Aliev, R.A., Bonfig, K.W., Jamshidi, M., Pedrycz, W., Turksen, I.B. (eds.) Proc. of the 8th Int. Conf. on App. of Fuzzy Systems and Soft Computing, pp. 396–404 (2008) 13. Müller, U., Barth, T., Seeger, B.: Accelerating the Retrieval of 3D Shapes in Geometrical Similarity Search using M-Tree-based Indexing. In: Delany, S.J. (ed.) Proc. of the ICCBR 2009 Workshops, pp. 151–162 (2009) 14. Aamodt, A., Plaza, E.: Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches. AI Communications 7, 39–59 (1994) 15. Gusfield, D.: Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press, New York (1997) 16. Romanowski, C.J., Nagi, R.: On Comparing Bills of Materials: A Similarity/Distance Measure for Unordered Trees. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans 35(2), 249–260 (2005) 17. Alizon, F., Shooter, S.B., Simpson, T.W.: Reuse of Manufacturing Knowledge to Facilitate Platform-Based Product Realization. Journal of Computing and Information Science in Engineering 6, 170–178 (2006) 18. Hjaltason, G.R., Samet, H.: Index-Driven Similarity Search in Metric Spaces. J. ACM Trans. Database Syst. 28(4), 517–580 (2003) 19. Guttmann, A.: R-Trees: A Dynamic Index Structure for Spatial Searching. In: Proc. of the ACM SIGMOD Conference, New York, pp. 47–57 (1984) 20. Tiwari, A., Vergidis, K., Lloyd, R., Cushen, J.: Automated Inspection using Database Technology within the Aerospace Industry. In: Proc. IMechE, vol. 222 Part B: J. Engineering Manufacture, pp. 175–183 (2008) 21. Liao, L., Lee, J.: Design of a Reconfigurable Prognostics Platform for Machine Tools. Expert Systems with Applications 37, 240–252 (2010) 22. Liu, J., Djurdjanovic, D., Ni, J., Casoetto, N., Lee, J.: Similarity Based Method for Manufacturing Process Performance Prediction and Diagnosis. Computers in Industry 58, 558– 566 (2007) 23. Qiu, H., Lee, J., Djudjanovic, D., Ni, J.: Advances on Prognostics for Intelligent Maintenance Systems. In: Proc. of 16th IFAC World Congress (2005) 24. Wang, T., Yu, J., Siegel, D., Lee, J.: A Similarity-Based Prognostics Approach for Remaining Useful Life Estimation of Engineered Systems. In: Proc. of Int. Conf. on Prognostics and Health Management, pp. 1–6 (2008) 25. Grauer, M., Metz, D., Karadgi, S.S., Schäfer, W.: Identification and Assimilation of Knowledge for Real-Time Control of Enterprise Processes in Manufacturing. In: 2010 Second Int. Conf. on Information, Process, and Knowledge Management, pp. 13–16 (2010) 26. Choudhary, A.K., Harding, J.A., Tiwari, M.K.: Data Mining in Manufacturing: A Review Based on the Kind of Knowledge. J. of Intell. Manuf. 20(5), 501–521 (2009) 27. Muia, T., Salam, A., Bhuiyan, N.F.: A Comparative Study to Estimate Costs at Bombardier Aerospace using Regression Analysis. In: Proc. of IEEE Int. Conf. on Industrial Engineering and Engineering Management, Hong Kong, pp. 1381–1385 (2009) 28. Dash, M., Liu, H., Yao, J.: Dimensionality Reduction of Unsupervised Data. In: Proc. of the 9th Int. Conf. on Tools with Artificial Intelligence, pp. 
532–539 (1997) 29. De Long, D.: Building the Knowledge-Based Organization: How Culture Drives Knowledge Behaviors. Working Paper, Ernst & Young LLP (1997)
A Multiagent Approach for Sustainable Design of Heat Exchanger Networks Naoki Kimura, Kizuki Yasue, Tekishi Kou, and Yoshifumi Tsuge Department of Chemical Engineering, Faculty of Engineering, Kyushu University 744 Motooka, Nishi-ku, Fukuoka 819-0395 Japan [email protected] http://altair.chem-eng.kyushu-u.ac.jp/
Abstract. Pinch technology is a familiar energy-saving technology in the chemical industries and has been incorporated into process simulation software; however, skilled engineers' considerations are required to make preferable heat exchanger networks (HEN). We introduced a multiagent-oriented simulation framework into HEN design to reduce energy usage and also reduce CO2 emissions. In our framework, a number of "HEN design agents" have their own strategies to optimize the HEN respectively. In this paper, we show the effectiveness of our framework by applying it to two chemical processes. Keywords: multiagent, pinch technology, heat exchanger network.
1 Introduction
Post-Kyoto Protocol negotiations on greenhouse gas emissions have been held for a sustainable environment. Although developed and developing countries are in conflict, there is a consensus that reductions of energy consumption and of greenhouse gas emissions are required to prevent global warming. In the chemical industries, it is requested to reduce emissions even further. "Pinch Technology" was developed in the 1970s and was applied to various plants to save energy in response to the first and second oil crises. Yoon et al. applied pinch technology to an actual ethylbenzene production process [1]. The energy saving of chemical plants largely depends on the performance of the heat exchanger network (HEN), and pinch technology is useful in this respect. Indeed, the methodology of pinch technology has been introduced into various software. Martín et al. have developed their original software named "Hint", based on the methodology of pinch technology, to educate chemical engineers in the masters' course [2]. However, such software only assists the engineers in designing and analyzing the HEN; it cannot optimize the HEN automatically. Because analyzing and designing a preferable HEN requires a large workload and skilled engineers' considerations, automatic optimization of the HEN would promote energy saving effectively and efficiently. Ponce-Ortega et al. and Dipama et al. tried to optimize HENs using genetic algorithms (GA) [3,4], and Yerramsetty et al. applied differential evolution (DE) to design and optimize the HEN to
avoid a large amount of calculation [5]. The objectives of the optimization problems in these studies are only the annual cost, but the environment surrounding chemical plants cannot be evaluated by an economic criterion alone. We have been developing a multiagent framework for HEN design [6,7] to support engineers in making decisions about the design and optimization of the HEN. In this study, we will show the effectiveness of our multiagent framework by applying it to two different chemical processes.
2 Multiagent Framework

2.1 Multiagent Framework for HEN-Design
In our framework, "pinch technology" (described in Section 2.2) was adopted to design energy-saving chemical processes. In this section, we explain the design workflow. Fig. 1 shows our multiagent framework. The Human Operator—on the upper left of Fig. 1—gives basic process information (i.e., the PFD,
Fig. 1. Multiagent framework (the Human Operator, Project Manager, Targeting Agent, HEN Design Agents and Equipment Cost Manager exchange the process information, composite curves, balanced composite curves and grid diagram)
stream data profile and available utilities) to the Project Manager. The Project Manager is the interface between the human operator and the multiagent system. The Project Manager then makes the "composite curves" and gives them to the Targeting Agent on the upper right of Fig. 1. The Targeting Agent decides the utility allocation and the amount of heat recovery constrained by ΔTmin (the minimum approach temperature). As a consequence, the Targeting Agent makes the "balanced composite curves". Then, the Targeting Agent converts the "balanced composite curves" into the initial HEN—which is expressed as a "grid diagram"—using the "area targeting method". In the area targeting method, the sum of the heat transfer areas is minimized on the supposition that all the heat is transferred from the hot stream(s) to the cold stream(s) lying just underneath them across the balanced composite curves. Therefore, the numbers of heat exchangers and branches of the process streams increase and the network gets complicated. For example, there are 17 heat exchangers and 11 branches in the grid diagram on the lower right of Fig. 1. It would thus be preferable if the HEN were simplified. In our framework, strategies for the optimization of the HEN were defined, and the HEN Design Agents receive the initial HEN from the Targeting Agent. Each HEN Design Agent makes alternative HENs by simplifying the network step by step following its own strategy. They then send the information about the alternative HENs to the Equipment Cost Manager. When the Equipment Cost Manager receives information about a HEN, the "equipment cost of the HEN" is calculated by summing up the estimated equipment costs of all the heat exchangers involved in the HEN. After calculating the equipment costs of all the alternatives, the Equipment Cost Manager picks out some alternatives whose equipment costs are low and sends them to the Project Manager. The Project Manager picks out three alternatives according to other criteria such as the complexity of the network and sends the selected alternatives back to the Human Operator. Finally, the Human Operator is able to make a decision on the HEN design based not only on the structure information of the HEN but also on the estimated values of the criteria provided by the multiagent system.
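The selection loop described above can be sketched schematically as follows; the function and attribute names (e.g. hen.heat_exchangers, the step functions returning None when constraints are violated) are assumptions for illustration, not the authors' implementation.

def equipment_cost(hen, cost_of_exchanger):
    # sum of the estimated equipment costs of all heat exchangers in the HEN
    return sum(cost_of_exchanger(hx) for hx in hen.heat_exchangers)

def run_agents(initial_hen, strategies, cost_of_exchanger, keep=3):
    alternatives = []
    for strategy in strategies:               # one HEN Design Agent per strategy
        hen = initial_hen
        for step in strategy:                 # e.g. DeleteLoop, SplitReduction, ...
            candidate = step(hen)             # assumed to return None if constraints fail
            if candidate is not None:
                hen = candidate
        alternatives.append((equipment_cost(hen, cost_of_exchanger), hen))
    alternatives.sort(key=lambda pair: pair[0])
    return alternatives[:keep]                # lowest equipment cost first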
2.2 Heat Recovery Analysis Using Pinch Technology
The "composite curves"—illustrated in the top middle of Fig. 1—represent the heat demands. In the composite curves, the vertical axis is temperature and the horizontal axis is enthalpy. There are two composite curves: one consists of solid lines representing the heating demands over the corresponding temperature ranges, and the other consists of broken lines representing the cooling demands of the process. The composite curves have three areas. "Area II" is the overlapping area of the heating/cooling composite curves, where heat recovery between the process streams is possible. However, utilities are required in "Area I" and "Area III". The minimum temperature difference ΔTmin is 10 K at the "pinch point". If the cold composite curve is shifted to the right or left, the pinch point is also shifted, and the ΔTmin and the three areas change accordingly. That is, if an appropriate ΔTmin is selected, the usage of utilities is reduced. However, a large heat transfer area is required when the temperature
difference is small. Therefore, a trade-off arises between utility usage and initial equipment cost. In general, alternatives with various ΔTmin are compared to optimize the HEN using the "total cost" (= annual equipment cost plus annual utility cost) as the objective function. To prevent global warming, the setting of reduction targets for greenhouse gas emissions will be obligated. In this study, we adopted a heuristic ΔTmin (= 10 K) to optimize the HEN. The "balanced composite curves"—illustrated on the middle right of Fig. 1—represent the result of heat recovery and energy cost minimization. The heating and cooling utilities are allocated to the composite curves—here the cooling utility (CW) amounts to 120 kW and the heating utility (LPS) to 960 kW.
2.3 Strategy for Optimization of HEN
There are several steps for simplifying the network structure of a HEN. First of all, it is necessary to search the HEN for specific partial structures, such as "loop", "split" and "path". Steps such as "DeleteLoop", "DeleteWideScaleLoop" and "SplitReduction" are defined. Before executing a step, it is necessary to check whether the new HEN would still meet the constraint conditions (heat balance, ΔTmin, etc.) if the structure were changed by the step. When the step is executed, the resulting network differs from the one before the step was carried out, and further steps are then applied to the new HEN one after another. It must be noted that the heat recovery, that is, the energy-saving effect, is not changed no matter which steps are applied. However, the structures of the new HENs vary depending on which steps were executed and/or which partial structures were changed, which means that the equipment costs vary from HEN to HEN. In this study, a "strategy" was defined in order to decide which kind of step or which partial structure has priority for execution; 21 strategies were prepared.
3 Simulations
We applied two different chemical processes—the 4-stream problem and 10SP1—to the above-mentioned multiagent system to design energy-saving HENs. In this section, we explain the respective simulation results.
3.1 4-Stream Problem
Process description. The target process is a 4-stream problem with two reactors and one distillation column, shown in Fig. 1. There are two hot streams (H1 and H2) and two cold streams (C1 and C2) whose stream data are presented in Table 1. Three heating utilities and one cooling utility are available; their temperatures and usage costs are also presented in Table 1. The utility cost (i.e., the "energy cost") using CW as the cooling utility and MPS as the heating utility, without heat recovery, was 456 million JPY/year.
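As a rough check of this figure from the data in Table 1: without heat recovery, the heating demand equals the total cold-stream duty, 3240 + 3200 = 6440 kW, and the cooling demand equals the total hot-stream duty, 2000 + 3600 = 5600 kW, so the energy cost is approximately 6440 kW × 66,000 JPY/kW-year (MPS) + 5600 kW × 5,500 JPY/kW-year (CW) ≈ 456 × 10^6 JPY/year.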
Table 1. Stream data and available utilities for the 4-stream problem

Stream/Utility Name                              Inlet T [K]   Outlet T [K]   ΔH [kW]   Utility cost [JPY/kW–year]
Hot streams         H1                           453.0         353.0          2000.0    –
                    H2                           403.0         313.0          3600.0    –
Cold streams        C1                           303.0         393.0          3240.0    –
                    C2                           333.0         373.0          3200.0    –
Heating utilities   Low Pressure Steam (LPS)     378.0         –              –         33,000
                    Middle Pressure Steam (MPS)  413.0         –              –         66,000
                    High Pressure Steam (HPS)    473.0         –              –         110,000
Cooling utility     Cooling water (CW)           293.0         298.0          –         5,500
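As a hedged illustration of the targeting described in Section 2.2 (a minimal problem-table sketch in Python, not the authors' Targeting Agent), the following code computes the minimum utility targets for the stream data of Table 1 with ΔTmin = 10 K; it reproduces the 960 kW of heating (LPS) and 120 kW of cooling (CW) quoted above.

DT_MIN = 10.0

# (inlet T [K], outlet T [K], duty [kW]); the heat capacity flow rate is CP = duty / |Tin - Tout|
hot = [(453.0, 353.0, 2000.0), (403.0, 313.0, 3600.0)]
cold = [(303.0, 393.0, 3240.0), (333.0, 373.0, 3200.0)]

def shifted(streams, dt):
    # shift the stream temperatures by dt (hot streams down, cold streams up, by dT_min/2)
    return [(tin + dt, tout + dt, q / abs(tin - tout)) for tin, tout, q in streams]

h = shifted(hot, -DT_MIN / 2)
c = shifted(cold, +DT_MIN / 2)
bounds = sorted({t for tin, tout, _ in h + c for t in (tin, tout)}, reverse=True)

def cp_sum(streams, t_hi, t_lo):
    # total CP of the streams present in the shifted temperature interval [t_lo, t_hi]
    return sum(cp for tin, tout, cp in streams
               if min(tin, tout) <= t_lo and max(tin, tout) >= t_hi)

cascade, surplus = [0.0], 0.0
for t_hi, t_lo in zip(bounds, bounds[1:]):
    surplus += (cp_sum(h, t_hi, t_lo) - cp_sum(c, t_hi, t_lo)) * (t_hi - t_lo)
    cascade.append(surplus)

q_hot = max(0.0, -min(cascade))    # minimum heating utility [kW]
q_cold = q_hot + cascade[-1]       # minimum cooling utility [kW]
print(q_hot, q_cold)               # -> 960.0 120.0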
Fig. 2. The PFD of the initial HEN and an optimized HEN for the 4-stream problem
Simulation Results. Fig. 2 illustrates the process flow diagrams. Fig. 2(a) is the initial HEN, which is equivalent to the grid diagram on the lower right of Fig. 1. Fig. 2(b) is one of the HENs optimized by a HEN Design Agent. Table 2 compares the energy consumptions and the sums of the equipment costs among the HENs for the 4-stream problem. Fig. 2 and Table 2 show that the multiagent system using strategies is effective in reducing the complexity of the network and also in reducing the equipment cost while the targeted heat recovery is attained.
3.2 10SP1 Problem
Process description. This process is a popular problem that has been studied by many research groups. It is a large process with five hot streams (H1–H5) and five cold streams (C1–C5). The stream data and available utilities are presented in Table 3. The balanced composite curves of the 10SP1 problem are illustrated in Fig. 3; the cooling utility (CW) amounts to 1879 kW, while no heating utility is used.
Table 2. Comparison among HENs for the 4-stream problem

Description                                              Heating   Cooling   Heat          Branches   Sum of equipment
                                                         utility   utility   exchangers               cost
                                                         [kW]      [kW]      [-]           [-]        ×10^6 [JPY]
Given problem w/o heat recovery
  (corresponding to the upper left of Fig. 1)            6,440     5,600     –             –          –
The initial HEN achieving the heat recovery target
  w/o optimization (corresponding to Fig. 2(a))          960       120       17            10         48.3
An example of the HEN optimized by the multiagent
  system (corresponding to Fig. 2(b))                    960       120       8             0          45.1
Fig. 3. Balanced composite curves of the 10SP1 problem (T [K] versus H [kW]; hot and cold streams with the allocated cooling water, CW)
Simulation Results. Table 4 shows the comparison of the energy consumption, the numbers of heat exchangers and branches, and the annualized sum of the equipment costs. The initial HEN is a very large and complicated network with 169 heat exchangers and 90 branches. However, a HEN Design Agent produces a greatly simplified HEN with only 10 heat exchangers and no branches, and the sum of the equipment costs is reduced by 62%. Dipama et al. tried to synthesize HENs for this process using genetic algorithms. Their number of heat exchangers was 15 and their annualized equipment cost was higher than that of our result by 30%. Because their objective function is to maximize heat recovery, their energy consumption is smaller than ours. To maximize heat recovery, it is effective to decrease ΔTmin—their ΔTmin
Table 3. Stream data and available utilities for the 10SP1 problem

Stream/Utility Name                    Inlet T [K]   Outlet T [K]   ΔH [kW]   Utility cost [US$/kW–year]
Hot streams        H1                  433.0         366.0          589.0     –
                   H2                  522.0         411.0          1171.0    –
                   H3                  544.0         422.0          1532.0    –
                   H4                  500.0         339.0          2378.0    –
                   H5                  574.0         339.0          2358.0    –
Cold streams       C1                  355.0         450.0          1642.0    –
                   C2                  366.0         478.0          1557.0    –
                   C3                  311.0         494.0          1545.0    –
                   C4                  333.0         433.0          762.0     –
                   C5                  389.0         495.0          644.0     –
Heating utility    Steam               509.0         –              –         37.64
Cooling utility    Cooling water (CW)  311.0         355.0          –         18.12
Table 4. Comparison among HENs for the 10SP1 problem

Description                                                Heating   Cooling   Heat          Branches   Annualized sum of
                                                           utility   utility   exchangers               equipment cost
                                                           [kW]      [kW]      [-]           [-]        ×10^3 [US$/yr]
Given problem w/o heat recovery                            6,149     8,028     –             –          –
The initial HEN achieving the heat recovery target
  w/o optimization                                         0         1,879     169           90         24.78
An example of the HEN optimized by the multiagent system   0         1,879     10            0          9.39
The HEN optimized using a genetic algorithm
  by Dipama et al. [4]                                     196       1,048     15            0          12.31
The HEN optimized using differential evolution
  by Yerramsetty et al. [5]                                0         1,879     10            0          9.49
is 10 K, although our ΔTmin is 28 K—even if it leads to a soaring cost of the equipment. Yerramsetty et al. also tried to synthesize HENs using differential evolution to optimize the structure of the network, the heat loads of the heat exchangers, the split-stream heat flows and ΔTmin simultaneously. Their result is also indicated in Table 4. The energy consumption and the numbers of heat
exchangers and branches are the same as in our result. Furthermore, the annualized equipment costs of our HEN and Yerramsetty's are almost equal.
4 Conclusion
In this paper, two different chemical processes were applied to the multiagent system using pinch technology and strategies for the optimization of HENs to make energy-saving processes. Even when the given process is large and the initial network is very complicated, it was possible to reduce the utility usage and the equipment cost by simplifying the HEN in accordance with the strategies which the HEN Design Agents hold individually.
References 1. Yoon, S.G., Lee, J., Park, S.: Heat integration analysis for an industrial ethylbenzene plant using pinch analysis. Applied Thermal Engineering 27, 886–893 (2007) 2. Martín, Á., Mato, F.A.: Hint: An educational software for heat exchanger network design with the pinch method. Education for Chemical Engineers 3, e6–e14 (2008) 3. Ponce-Ortega, J.M., Serna-González, M., Jiménez-Gutiérrez, A.: Synthesis of multipass heat exchanger networks using genetic algorithms. Computer and Chemical Engineering 32, 2320–2332 (2008) 4. Dipama, J., Teyssedou, A., Sorin, M.: Synthesis of heat exchanger networks using genetic algorithms. Applied Thermal Engineering 28, 1763–1773 (2008) 5. Yerramsetty, K.M., Murty, C.V.S.: Synthesis of cost-optimal heat exchanger networks using differential evolution. Computers and Chemical Engineering 32, 1861–1876 (2008) 6. Kou, T., Kimura, N., Tsuge, Y.: A Multiagent Approach to Optimize Heat Exchanger Networks for Energy-saving Process Design. In: 3rd International Symposium on Novel Carbon Resource Sciences: Advanced Materials, Processes and Systems toward CO2 Mitigation, Fukuoka, pp. 336–341 (2009) 7. Kimura, N., Fujii, W., Tsuge, Y.: A Design and Optimization Mechanism of Heat Exchanger Network using Multiagent Framework. In: Taiwan/Korea/Japan Chemical Engineering Conference and 55th TwIChE Annual Conference, OS7003, Taipei (2008)
Consistency Checking Method of Inventory Control for Countermeasures Planning System Takashi Hamaguchi 1, Kazuhiro Takeda 2, Hideyuki Matsumoto 3, and Yoshihiro Hashimoto 1 1
Nagoya Institute of Technology, 466-8555 Gokiso-cho, Showa-ku, Nagoya-shi, Aichi 2 Shizuoka University, 432-8561 3-5-1 Johoku, Naka-ku, Hamamatsu-shi, Shizuoka 3 Tokyo Institute of Technology, 2-12-1-S1-42, O-okayama, Meguro-ku, Tokyo [email protected]
Abstract. Human operators must plan adequate countermeasures to correct abnormal situations in chemical plants. Operators in these situations switch some of the controllers from automatic to manual and manipulate one variable while observing another during each countermeasure, and/or change the set values of the throughput controller when some manipulated variables have become saturated. However, a flow rate controller may not work as a throughput controller in abnormal situations. Therefore, the efficiency of countermeasures needs to be checked by systems that support operators. Hamaguchi et al. proposed a method of configuring a multi-loop controller using CE-matrices for a system that planned countermeasures. We propose a method of checking the consistency with which inventory is controlled so that adequate countermeasures can be planned. Keywords: CE-matrices, countermeasures planning, inventory control.
1 Introduction

Distributed Control Systems (DCSs) are used in most chemical plants to simultaneously control thousands of process variables. The main role of human operators in the control systems of these plants is to supervise highly automated systems and plant operations, both in normal and in abnormal situations. The number of operators has recently decreased due to the introduction of advanced control systems. Although the frequency of accidents is very low, the load imposed on operators in abnormal situations has increased. DCSs are not effective in abnormal situations because of the lack of systems to support the operator decision-making that would maintain safe operations at chemical plants. Operators are faced with serious tasks such as complex decision-making to detect, diagnose, and assess emergencies and plan countermeasures that will compensate for and correct abnormal situations. We need to support these activities of operators. Human operators placed in abnormal situations must plan adequate countermeasures against these situations at chemical plants. While monitoring and fault diagnosis are very fertile grounds for theoretical and industrial developments, there have been few papers that have discussed how countermeasures should be planned.
It is difficult to find the causes of faults without any additional information, although many algorithms have been proposed for diagnosing faults. Multiple possibilities for faults are obtained by using fault-diagnosis systems due to the inadequate number of sensors for diagnosing them. Cooling might be a good countermeasure against some possible faults, but for other possibilities it might be dangerous. However, one countermeasure might be effective against multiple causes. It could then be executed even if multiple possibilities were diagnosed. Therefore, it is necessary to plan countermeasures as well as to make complete fault diagnoses. Human operators in abnormal situations switch parts of the controllers from automatic to manual. They must bear in mind numerous measured variables and operate many manipulated variables by themselves, taking obstacles into consideration, such as what kinds of problems might occur when abnormal situations are left untreated, in which situations abnormal situations can be settled, and what actions are necessary to stabilize the plant. Human beings are not good at thinking about multiple problems simultaneously. It is not difficult for operators to focus their attention on one measured variable and the manipulation associated with it. The actions of operators are similar to the configuration of a single-loop controller. Therefore, a method of designing the controller configuration can be applied to the planning of countermeasures in abnormal situations. It is difficult to solve these controller configuration designs for entire plants, and there are few systematic approaches to doing so. The Relative Gain Array (RGA) approach is well known for pairing problems that take interactions into consideration [1]. Since this RGA approach requires identification of the steady-state gain matrix, it cannot be fully applied to plant-wide problems. Kida proposed a plant-wide control system design based on practical rules and a quantitative model [2]. Although it is too difficult to assume a quantitative model of the whole plant in abnormal situations, a qualitative model can easily deal with information on the whole plant in abnormal situations, such as in a fault-diagnosis system. Hamaguchi et al. have already proposed a method of designing the configuration of controllers and systems for planning countermeasures based on the qualitative model [3][4][5]. The controller configurations can be automatically generated and their availability can be assessed by multiplying Boolean matrices. The independence of the individual control loops to cancel disturbances can be checked with the method they proposed. Operators in abnormal situations may change the set values of the throughput controller when some manipulated variables have become saturated. However, the flow rate controller may not work as a throughput controller in abnormal situations. Therefore, we need to check the efficiency of countermeasures offered by operator-support systems. However, the proposed method of configuring controllers cannot evaluate how efficiently set values have been changed to control the inventory plant-wide, because the expressions for the control-loop model are optimized to check for and cancel disturbances. The method of configuring controllers using CE-matrices was modified to check how consistently inventory is controlled, which enables countermeasures to be planned as changes in the set values for throughput controllers.
2 Method of Configuring Controllers Using CE-Matrices

This section explains the method of configuring controllers, which is based on expressing the qualitative model using matrices of cause-and-effect relationships. We
called them CE-matrices. These CE-matrices have elements "0" or "1". Each column variable corresponds to a cause and each row variable to an effect. When the (i, j) element of a CE-matrix is 1, the j-th column variable affects the i-th row variable. All the matrices in this paper are CE-matrices. An example two-tank system is shown in Fig. 1. F1 and F2 in this plant are the material flow rates and F4 is the product flow rate. The controlled variables in this plant are the level of tank 1, L1, the level of tank 2, L2, and the product flow rate, F4. Four valves, V1, V2, V3, and V4, are utilized as the manipulated variables. The variables that are neither controlled variables nor manipulated variables are called explanatory variables in this paper.
Fig. 1. Example two-tank system
The CE model outlined in Fig. 2 represents the cause-and-effect relationships among the variables of the example two-tank system. The process variables are represented by ellipses and the manipulated variables by rectangles in this model. Ellipses with a double-line border denote controlled variables and ellipses with a single-line border denote explanatory variables. We thought that a mental model of a human operator could be represented by such a simple qualitative model.
Fig. 2. CE model of the example two-tank system
The CE-matrix of plant G in Fig. 3, which describes the model of the example two-tank system, can generate the CE model in Fig. 2. Manipulated variables are not included among the row variables of the CE-matrix G. The "1" at the (1, 7) element in Fig. 3 indicates that the valve V1 affects the flow rate F1.
G     F1  F2  L1  F3  L2  F4  V1  V2  V3  V4
F1     0   0   0   0   0   0   1   0   0   0
F2     0   0   0   0   0   0   0   1   0   0
L1     1   1   0   1   0   0   0   0   0   0
F3     0   0   1   0   0   0   0   0   1   0
L2     0   0   0   1   0   1   0   0   0   0
F4     0   0   0   0   1   0   0   0   0   1

Fig. 3. CE-matrix of plant G
Let us consider the two controller configurations for the example two-tank system that are outlined in Figs. 4 and 5. Controller configuration 1 in Fig. 4 is not acceptable, whereas configuration 2 in Fig. 5 is acceptable. The purpose of the method of configuring controllers is to assess this difference so that countermeasures can be planned. This approach is used to eliminate unacceptable controller configurations when planning countermeasures that involve set values for the throughput controller. Flow controller FC1 for F4 is paired with valve V4 and level controller LC1 for L1 is paired with valve V1 in both configurations 1 and 2. Level controller LC2 for L2 is paired with valve V2 in controller configuration 1. LC2, on the other hand, is paired with valve V3 in controller configuration 2.
Fig. 4. Controller configuration 1
These controller configurations correspond to the example of countermeasures against problems in controlling L2. In the unacceptable controller configuration 1, the controlled variable L1 is included in the transfer path from V2 to L2, and the effect caused by V2 does not reach L2 since LC1 is trying to maintain L1 in a steady state. If the head of L1 does not change, then F3 from tank 1 is also kept in a steady state. Therefore, controller configuration 1 is not acceptable for canceling disturbances. If LC1 is automated, then a poor countermeasure is to manually operate V2, as seen in Fig. 4, but an adequate countermeasure is to manually operate V3, as seen in Fig. 5. The method of configuring controllers is useful for planning countermeasures and evaluating them. All control loops are independent in controller configuration 2, which is acceptable.
Fig. 5. Controller configuration 2
How controllers are configured with this method can be explained with the CE-matrix C. The CE-matrices C1 and C2 in Fig. 6 describe configurations 1 and 2. The manipulated variable of LC2 in configuration 1 is the valve opening V2, so the (8, 5) element of CE-matrix C1 is "1". In configuration 2, the manipulated variable of LC2 is the valve opening V3, so the (9, 5) element of CE-matrix C2 is "1". The flow rates F1, F2, and F3 are explanatory variables in both configurations. The diagonal elements of the explanatory variables are "1"s. These "1"s are used to retain the cause-and-effect relationships from all process variables other than the controlled variables to all process variables in G. Each column of CE-matrix C has only one "1" element and each of its rows has at most one "1"; all single-loop combinations can thus be easily generated from a unit matrix. Therefore, the controller configurations can be generated automatically by computer. CE-matrix C corresponds to a countermeasure in an abnormal situation. Hence, the countermeasure options can also be generated automatically. Control loop model GC presents the cause-and-effect relationships resulting from the manipulations for the cancellation of disturbances in a controlled plant. CE-matrix GC is generated by the Boolean multiplication of the two matrices, plant G and controller C. The CE-matrices GC1 and GC2 in Fig. 7 describe the control loop models obtained by using configurations 1 and 2.
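As a small hedged sketch (an illustration, not the authors' tool) of this automatic generation, every one-to-one pairing of the controlled variables with distinct valves yields one candidate configuration; each candidate is then turned into a CE-matrix C and screened with the reachability check described below.

from itertools import permutations

controlled = ["L1", "L2", "F4"]
valves = ["V1", "V2", "V3", "V4"]

# each candidate maps every controlled variable to a distinct valve; the explanatory
# variables keep a diagonal "1" when the candidate is turned into a CE-matrix C
candidates = [dict(zip(controlled, chosen))
              for chosen in permutations(valves, len(controlled))]
print(len(candidates))   # 24 candidate pairings for the two-tank example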
C1    F1  F2  L1  F3  L2  F4        C2    F1  F2  L1  F3  L2  F4
F1     1   0   0   0   0   0        F1     1   0   0   0   0   0
F2     0   1   0   0   0   0        F2     0   1   0   0   0   0
L1     0   0   0   0   0   0        L1     0   0   0   0   0   0
F3     0   0   0   1   0   0        F3     0   0   0   1   0   0
L2     0   0   0   0   0   0        L2     0   0   0   0   0   0
F4     0   0   0   0   0   0        F4     0   0   0   0   0   0
V1     0   0   1   0   0   0        V1     0   0   1   0   0   0
V2     0   0   0   0   1   0        V2     0   0   0   0   0   0
V3     0   0   0   0   0   0        V3     0   0   0   0   1   0
V4     0   0   0   0   0   1        V4     0   0   0   0   0   1

Fig. 6. Controller configurations C1 and C2
GC1   F1  F2  L1  F3  L2  F4        GC2   F1  F2  L1  F3  L2  F4
F1     0   0   1   0   0   0        F1     0   0   1   0   0   0
F2     0   0   0   0   1   0        F2     0   0   0   0   0   0
L1     1   1   0   1   0   0        L1     1   1   0   1   0   0
F3     0   0   0   0   0   0        F3     0   0   0   0   1   0
L2     0   0   0   1   0   0        L2     0   0   0   1   0   0
F4     0   0   0   0   0   1        F4     0   0   0   0   0   1

Fig. 7. Control loop models GC1 and GC2
LC2's signal is sent to V2 in configuration 1. The change in V2 affects F2. Therefore, the (2, 5) element of CE-matrix GC1 is "1". In configuration 2, LC2's signal is sent to V3. The change in V3 affects F3. Therefore, the (4, 5) element of CE-matrix GC2 is "1". Note that the (6, 5) element in G is "1" and the (6, 5) elements in both GC1 and GC2 are "0". The method of configuring controllers assumes that the controlled variables will be kept in a steady state by canceling disturbances, and the original effects from the controlled variables to the process variables are therefore cleared in control loop model GC. The modifications to the cause-and-effect relationships caused by the manipulations for the cancellation of disturbances in a controlled plant can be expressed by means of the C matrix. The behavior of a controlled plant in the disturbance-canceling mode is represented by GC. The expression of the C matrix is modified to plan countermeasures as a change in the set values for the throughput controller and to check the consistency with which inventory is controlled; this is discussed in the next section. Reachability matrix R is defined by using GC as derived in Eq. (1), where k is a positive integer.
R = \sum_{k=1}^{\infty} (GC)^k \qquad (1)
If the dimension of GC is n, reachability matrix R is defined as
R = \sum_{k=1}^{n-1} (GC)^k \qquad (2)
Reachability matrix R indicates the effects between process variables in a steady state at a controlled plant. Reachability matrices R1 obtained by using configuration 1 and R2 obtained by using configuration 2 are both in Fig. 8.
R1    F1  F2  L1  F3  L2  F4        R2    F1  F2  L1  F3  L2  F4
F1     1   1   1   1   1   0        F1     1   1   1   0   0   0
F2     0   0   0   1   1   0        F2     0   0   0   0   0   0
L1     1   1   1   1   1   0        L1     1   1   1   1   1   0
F3     0   0   0   0   0   0        F3     0   0   0   1   1   0
L2     0   0   0   1   0   0        L2     0   0   0   1   1   0
F4     0   0   0   0   0   1        F4     0   0   0   0   0   1

Fig. 8. Reachability matrices R1 and R2
The (5, 5) element in R1 is "0", i.e., the manipulation by LC2 does not reach L2 in R1. Therefore, R1 indicates that controller configuration 1 is not acceptable. In contrast, the diagonal elements of all controlled variables in R2 are "1", so R2 indicates that controller configuration 2 is acceptable. The method of configuring controllers can thus be used to assess the acceptability of configured controllers and of the countermeasures to be taken by using the reachability matrix.
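A minimal Python sketch of this Boolean CE-matrix algebra is given below (an illustration under the definitions above, not the authors' code; the helper names are assumptions). It builds G and C for the two-tank example, forms GC by Boolean multiplication, computes R as in Eq. (2) and checks the diagonal entries of the controlled variables.

import numpy as np

VARS = ["F1", "F2", "L1", "F3", "L2", "F4"]   # row/column order of the CE-matrices
MVS = ["V1", "V2", "V3", "V4"]                # manipulated variables (extra columns of G)

# CE-matrix G of the example two-tank plant (Fig. 3); rows are effects, columns are causes
G = np.array([
    # F1 F2 L1 F3 L2 F4 V1 V2 V3 V4
    [0, 0, 0, 0, 0, 0, 1, 0, 0, 0],   # F1 <- V1
    [0, 0, 0, 0, 0, 0, 0, 1, 0, 0],   # F2 <- V2
    [1, 1, 0, 1, 0, 0, 0, 0, 0, 0],   # L1 <- F1, F2, F3
    [0, 0, 1, 0, 0, 0, 0, 0, 1, 0],   # F3 <- L1, V3
    [0, 0, 0, 1, 0, 1, 0, 0, 0, 0],   # L2 <- F3, F4
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 1],   # F4 <- L2, V4
])

def boolmul(A, B):
    # Boolean matrix product: AND for the element products, OR for the sum
    return (A @ B > 0).astype(int)

def controller(pairing, explanatory):
    # C maps each controlled variable to its paired valve and each explanatory
    # variable to itself (diagonal "1"), as in Fig. 6
    C = np.zeros((len(VARS) + len(MVS), len(VARS)), dtype=int)
    for v in explanatory:
        C[VARS.index(v), VARS.index(v)] = 1
    for cv, mv in pairing.items():
        C[len(VARS) + MVS.index(mv), VARS.index(cv)] = 1
    return C

def reachability(GC):
    # R = GC + (GC)^2 + ... + (GC)^(n-1), Eq. (2), with Boolean arithmetic
    R, power = np.zeros_like(GC), GC.copy()
    for _ in range(len(GC) - 1):
        R = np.maximum(R, power)
        power = boolmul(power, GC)
    return R

pairing2 = {"L1": "V1", "L2": "V3", "F4": "V4"}   # controller configuration 2
C2 = controller(pairing2, explanatory=["F1", "F2", "F3"])
R2 = reachability(boolmul(G, C2))
print(all(R2[VARS.index(cv), VARS.index(cv)] == 1 for cv in pairing2))  # acceptable -> True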
3 Method of Checking Consistency of Inventory Control

In this section, we consider a change in the set values for the throughput flow rate as a countermeasure. The consistency of such set-value changes with the plant-wide control of inventory must be checked by a system that provides operator support so that effective countermeasures can be planned. Therefore, the effectiveness of the interlocking control between the throughput flow rate and the material and product flow rates can be checked by using the reachability matrix. As can be seen from Fig. 9, the throughput controller is FC1 in configuration 2. In this example scenario, the product flow rate F4 is changed by FC1. The increase in F4 causes a decrease in L2, and LC2 then causes an increase in F3. The increase in F3
causes a decrease in L1, and LC1 then causes an increase in the material flow rate F1. Therefore, throughput control using FC1 in configuration 2 can increase the material flow rate as a pull controller and increase the product flow rate as a push controller.
Fig. 9. Throughput controller
D (column FC1):              F1 = 0,  F2 = 0,  L1 = 0,  F3 = 0,  L2 = 0,  F4 = 1
R2D = R2 × D (column FC1):   F1 = 0,  F2 = 0,  L1 = 0,  F3 = 0,  L2 = 0,  F4 = 1   (R2 as in Fig. 8)

Fig. 10. Matrix for disturbance reachability R2D
However, this relationship cannot be checked by using the reachability matrix explained in Section 2. A change in set values in the steady state causes disturbances in the controlled plant. The change in set values made by FC1 is represented by the disturbance matrix D, as outlined in Fig. 10. The matrix for disturbance reachability RD is generated by the Boolean multiplication of the two matrices R and D. The matrix for disturbance reachability R2D, using reachability matrix R2 for configuration 2, is outlined in Fig. 10. The effects of the change in the product flow rate caused by FC1 do not reach the upstream process variables in the matrix for disturbance reachability R2D. Therefore, the information in the reachability matrix must be modified for this purpose. The expression of controller matrix C must be modified to check the effectiveness of the interlocking control between the throughput flow rate and the material and product flow rates by using the reachability matrix.
First, the throughput controller must be selected from available alternatives. The diagonal element of the throughput controlled variable that is selected is changed from “0” to “1” in C. However, the elements of a paired manipulated variable for the selected controlled variable must be kept at “1”. Controller matrix C3 and control loop model GC3, which are used to check the effectiveness of the change in set values to control the inventory plant-wide, are shown in Fig. 11.
C3    F1  F2  L1  F3  L2  F4        GC3   F1  F2  L1  F3  L2  F4
F1     1   0   0   0   0   0        F1     0   0   1   0   0   0
F2     0   1   0   0   0   0        F2     0   0   0   0   0   0
L1     0   0   0   0   0   0        L1     1   1   0   1   0   0
F3     0   0   0   1   0   0        F3     0   0   0   0   1   0
L2     0   0   0   0   0   0        L2     0   0   0   1   0   1
F4     0   0   0   0   0   1        F4     0   0   0   0   0   1
V1     0   0   1   0   0   0
V2     0   0   0   0   0   0
V3     0   0   0   0   1   0
V4     0   0   0   0   0   1

Fig. 11. Modified control matrix C3 and control loop model GC3
R3    F1  F2  L1  F3  L2  F4
F1     1   1   1   0   0   1
F2     0   0   0   0   0   0
L1     1   1   1   1   1   1
F3     0   0   0   1   1   1
L2     0   0   0   1   1   1
F4     0   0   0   0   0   1

D (column FC1):              F1 = 0,  F2 = 0,  L1 = 0,  F3 = 0,  L2 = 0,  F4 = 1
R3D = R3 × D (column FC1):   F1 = 1,  F2 = 0,  L1 = 1,  F3 = 1,  L2 = 1,  F4 = 1

Fig. 12. Matrix for disturbance reachability R3D
Reachability matrix R3 is calculated with GC3 and the matrix for disturbance reachability R3D is then generated, as shown in Fig. 12. The effectiveness of interlocking control between throughput flow rate and material and product flow rates can be checked and confirmed with R3D.
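Reusing the hypothetical helpers from the sketch in Section 2, the throughput check can be expressed in a few lines: the diagonal entry of the throughput controlled variable F4 is set to "1" in C while the paired valve V4 is kept, and the disturbance vector D marks the set-value change at FC1.

C3 = controller(pairing2, explanatory=["F1", "F2", "F3"])
C3[VARS.index("F4"), VARS.index("F4")] = 1     # keep F4 as a cause as well (throughput)
R3 = reachability(boolmul(G, C3))

D = np.zeros((len(VARS), 1), dtype=int)
D[VARS.index("F4"), 0] = 1                     # set-value change at FC1 (acts on F4)
R3D = boolmul(R3, D)                           # disturbance reachability, R3 x D

print(R3D.ravel())   # the entry for F1 is 1: the material flow rate follows the change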
4 Conclusions

We proposed a method of checking the consistency with which inventory is controlled so that adequate countermeasures can be planned. Operators also require
support with decision-making in the timing or magnitude of operations. Qualitative considerations should be added to the results obtained with the proposed procedure. This approach can be considered a first step in bridging the discontinuity between the diagnosis of faults and the planning of countermeasures against them.
References 1. Luyben, W.L., Luyben, M.L.: Essentials of Process Control. McGraw Hill, New York (1997) 2. Kida, F.: Plant wide control system (1) which the process engineer designs. On the consistency of the plant wide control loop composition. Simple judgment and composition standard tactics rule of the erratum of the control loop composition. Chemical Engineering 49, 144–151 (2004) (in Japanese) 3. Hamaguchi, T., Miura, A., Yoneya, A., Hashimoto, Y., Togari, Y.: Controller Configuration Design and Troubleshooting Planning Using CE-Matrices. Kagaku Kogaku Ronbunshu 25, 384–388 (1999) 4. Hamaguchi, T., Hashimoto, Y., Itoh, T., Yoneya, A., Togari, Y.: Selection of Operating Modes in Abnormal Situations. J. Chem. Eng. of Japan 35(12), 1282–1289 (2000) 5. Hamaguchi, T., Hashimoto, Y., Itoh, T., Yoneya, A., Togari, Y.: Abnormal Situation Correction Based on Controller Reconfiguration. J. Process Control 13, 16–175 (2003)
Fault Semantic Networks for Accident Forecasting of LNG Plants Hossam A. Gabbar Faculty of Energy Systems and Nuclear Science, University Institute of Technology, 2000 Simcoe St. N., Ohsawa, L1H 7K4 ON, Canada [email protected] http://nuclear.uoit.ca/EN/main/74131/gabbar.html
Abstract. In order to reduce risks associated with Liquefied Natural Gas (LNG) production facilities, one approach is to provide real-time, risk-based accident forecasting mechanisms and tools that enable the early understanding of process deviations and link them with possible accident scenarios. In this paper, a process and fault modeling technique is presented to build causation models and link them with accident scenarios using fault semantic networks (FSN). A forecasting algorithm is developed to identify and estimate safety measures for each operation step and process model element, validated with process conditions. Keywords: Fault Semantic Network, Accident Forecasting, LNG Production Plants, Fault forecasting.
1 Introduction LNG-based energy is receiving growing attention as a cleaner and more efficient energy supply. However, there are high risks associated with LNG production facilities. These risks are revealed in the different accidents that have occurred in the past. There are different techniques used to manage risks in LNG plant processes, such as risk analysis, mitigation, reduction, and other treatment methods. One approach to risk management in LNG plants is fault forecasting or prediction, so that appropriate actions can be considered. Process monitoring techniques have developed significantly to provide fault forecasting or prediction prior to any serious deviation or process upset. The concept of accident forecasting has been discussed by many researchers in the past, in different disciplines and from different views. Predictive maintenance is used to estimate remaining life through condition monitoring of equipment failures, which is one element of accident forecasting [1]. Predictive maintenance is mainly based on understanding equipment good-condition and degradation criteria, which are used to predict remaining life [2]. This technique is useful when one failure is assumed to contribute to the degradation of process equipment [3]. It becomes more complex when more than one failure contributes to process degradation or upset. Understanding equipment degradation and other contributing faults requires developing fault propagation scenarios with the consideration of multiple failures. The analysis of multiple failures with all possible R. Setchi et al. (Eds.): KES 2010, Part II, LNAI 6277, pp. 427–437, 2010. © Springer-Verlag Berlin Heidelberg 2010
degradations is the basis of accident forecasting. This paper presents an integrated methodology to analyze multiple fault propagation models with degradation and process condition information, which are used for accurate fault and accident forecasting. The paper is composed of an introduction to the LNG process, followed by the FSN representation and then fault / accident forecasting using FSN. 1.1 LNG Process Description In order to provide a better accident forecasting solution, it is important to understand liquefied natural gas, or LNG, processes. LNG is natural gas, mainly methane (CH4). It is temporarily converted into liquid form for ease of storage and transportation. Typically, it is converted into liquid when the production location is far from the usage location. The process starts with treatment in gas processing units, where higher molecular weight hydrocarbons, sulfur compounds and water are removed. This is followed by feeding the treated gas into the liquefaction process, where it is cooled down. The cooling is typically done using two or three cooling cycles so that the liquefaction temperature reaches -160°C (-256°F). The cold liquid LNG is then moved to well-insulated storage tanks at atmospheric pressure, from which it is loaded into LNG tankers for distribution / shipment. Each of the cooling cycles requires a very large compression train, which is usually driven by a gas turbine. In Qatar, the 8 Mtons/y plants are typically based on three trains, each driven by a 125 MW Fr9E gas turbine. 1.2 LNG Accidents Several LNG accidents have happened in the past. Some of these accidents led to fire and explosion. For example, in October 1944, at the Cleveland, Ohio peak-shaving plant, a tank failed and spilled its contents into the street and storm sewer system. The resulting explosion and fire killed 128 people. The tank was built with a steel alloy that had low nickel content, which made the alloy brittle when exposed to the extreme cold of LNG. Other LNG accidents led to spills or leaks. For example, in early 1965 (the Methane Princess spill), LNG discharging arms were disconnected prematurely before the lines had been completely drained, causing LNG liquid to pass through a partially opened valve and onto a stainless steel drip pan placed underneath the arms. This caused a star-shaped fracture to appear in the deck plating in spite of the application of seawater. The analysis of these accidents provides a basis to develop accident forecasting techniques [4].
2 Accident / Fault Modeling 2.1 Causation Modeling In order to understand the dynamic of each accident / failure, causation model is developed starting from root causes of any accident / upset / or deviation, till the final
Fig. 1. Typical causation model, showing possible causes and consequences for a given top event, where escalation factors are used to show possible propagation. This includes controls / barriers for each propagation step.
accident or disaster. Each causation model goes through several escalation steps where controls and barriers are incorporated to prevent or mitigate each escalation step. The causation model is formulated as shown in figure 1. The proposed model will enable the identification of the elements at each stage and link them through the process model, as described above. 2.2 Fault Modeling Accidents are made of sequences of escalations of failures / faults. In order to model accidents, it is important to provide a systematic mechanism to model failures. The
Fig. 2. POOM-based fault propagation model, showing different levels of fault propagation scenarios, which include: site, plant, process, equipment, part, material, and human. It includes symptoms and conditions associated with each fault. The nature of fault propagation is that it changes dynamically with real-time operating conditions. This requires developing dynamic structures of fault propagation that are further tuned based on any changes in process condition. The fault semantic network is proposed to represent the dynamic structures of fault propagation scenarios, as explained in the following section.
proposed failure modeling approach is shown in figure 2, which shows the relationships between failures and sensors, actuators (e.g. valves), process equipment, controllers, and the surrounding environment. Failure is structured hierarchically in view of the Process Object Oriented Modeling Methodology (POOM). The highest level is the plant facility, where faults related to the site are modeled vs. causes, consequences, and associated controls and barriers. The second level is the plant process, which includes process equipment. Faults related to material and human are recorded with the associated symptoms.
Fig. 3. Fault Semantic Network (FSN), which includes knowledge structure for the fault model that links plant structure, behavior, operation, and associated process / equipment variables. FM is failure mode, PV is process variable, HV is human variable, EV is equipment variable, EnV is environmental variable, and Dev is deviation, such as high-pressure, etc.
2.3 Fault Semantic Network (FSN) In view of the proposed process model using POOM, the fault semantic network is utilized to construct a flexible fault knowledge structure in a qualitative manner, as shown in figure 3. Initially, the FSN is constructed based on the ontology structure of fault models on the basis of POOM, where FM, or failure mode, is described using symptoms, enablers, process variables, causes, and consequences. Each link can be associated with rules. For example, failures related to leak might be associated with rules such as "IF (Structure.Material = (X or Y)) and (PV = Pressure) and (Dev = Very-High) THEN (FM = Fracture)." These rules are initially defined in generic form based on domain knowledge, i.e. regardless of plant-specific knowledge. These rules are then specialized for a specific plant, e.g. rules associated with tank-1 and pipe-2. A formal language is proposed to represent process domain knowledge and safety control rules, as explained in [5]. The formal language facilitates the synthesis and validation of fault models within FSN. The constructed FSN will be maintained and utilized as part of the proposed fault simulator, as explained in the next section.
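As an illustration only, the generic rule quoted above could be encoded as data and matched against a plant state as sketched below. The dictionary layout, attribute names and the matching function are our own assumptions for illustration; they are not the paper's formal language.

# Hypothetical encoding of a generic FSN rule; names are illustrative only.
generic_rule = {
    "if": {"material": {"X", "Y"}, "process_variable": "Pressure", "deviation": "Very-High"},
    "then": {"failure_mode": "Fracture"},
}

def rule_fires(rule, state):
    """Return True when every condition of the rule matches the observed state."""
    cond = rule["if"]
    return (state["material"] in cond["material"]
            and state["process_variable"] == cond["process_variable"]
            and state["deviation"] == cond["deviation"])

# Plant-specific instantiation, e.g. for tank-1.
tank1_state = {"material": "X", "process_variable": "Pressure", "deviation": "Very-High"}
if rule_fires(generic_rule, tank1_state):
    print("Candidate failure mode:", generic_rule["then"]["failure_mode"])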
3 Accident Forecasting Algorithm Based on the proposed FSN, it is important to find a systematic mechanism to construct the FSN in the first place for a given process design model, i.e. a process flow diagram or piping and instrumentation diagram. The FSN is used as a basis for accident forecasting, as explained in this section. In the FSN, failure modes, or FM, are linked in networks as cause-consequence models. Each FM is linked with a set of related process variables. PCA, or Principal Component Analysis, is used to reduce dimensionality by selecting the process variables most related to each FM. This is extracted from real-time and simulation data, and fine-tuned by human experience. It will be further tuned once a new data set is received. The second step is to predict safety margins from process data using EWMA, or the Exponentially Weighted Moving Average method, which dynamically calculates control and safety margins and estimates variability for each process variable within each operation. These data sets are calculated for different failure modes and propagation scenarios so that they can be linked with the initiating event. In real time, process variables are used to estimate deviations, which are compared with the corresponding trends of possible fault propagation scenarios. The closest scenario identifies the candidate fault propagation scenario. In addition, neurofuzzy methods with a Hidden Markov Model (HMM) are used to predict the future state and forecast the fault / accident scenario. The details of the proposed algorithm are presented in this section. Figure 4 shows the proposed algorithm.

Fig. 4. Accident forecasting algorithm, which starts with mapping the POOM-based process model with FSN. Use PCA to reduce dimensionality to identify PVs related to each deviation. Use neurofuzzy logic and the Hidden Markov Model (HMM) algorithm to create rules and predict PV future values. By comparing estimated PVs with FSN, learn possible causes and consequences, which are used to define possible corrective actions.

3.1 Identification of Process Variables Related to Faults In the proposed accident forecasting algorithm, PCA, or Principal Component Analysis, is used to identify the process variables most related to each fault. This is performed in two stages. In the first stage, a fault simulator is used to record process variables for the time prior to the fault / failure. This forms the input to the PCA algorithm to calculate the principal components of a reduced list of process variables related to each fault / failure, as shown in figure 5.
Once the principal components matrix P has been established for an operating state, it is possible to use this to determine the residual of a new sample. The residuals for each variable can then be observed over a period of time to determine drifting. The new sample is a vector, x, that has the same variables as a row of the matrix X, as shown in equation (1).

x = [x1, x2, ..., xm]     (1)

Fig. 5. PCA algorithm to estimate the closest list of process variables PVs related to each failure / fault [flowchart: original training data X, calculate covariance matrix X^T X, determine eigenvalues / eigenvectors, choose principal components P, transform data into PC subspace XP, back-transform to original coordinates XPP^T, determine residuals, calculate Q-threshold].

The steps involved in the determination are: (a) Obtain the data sample to be checked; (b) Construct vector x, scaling it to zero mean using the mean from the columns of matrix X; (c) Establish the estimate of x using only the selected principal components, using equation (2).
x* = xPP^T     (2)

Calculation of the residual value is determined using equations (3)-(6):

x = x* + e, with e being the vector of residuals,     (3)

therefore

e = x - x*     (4)
e = x - xPP^T     (5)
e = x(I - PP^T)     (6)

Calculation of the Q-statistic value is shown in equation (7):

Q-statistic = ee^T = ||x - x*||^2     (7)
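A minimal numerical sketch of equations (2)-(7) is given below using NumPy. The training matrix, the number of retained components and the use of an empirical percentile in place of the chi-square-based Q-threshold discussed next are all illustrative assumptions, not the authors' implementation.

import numpy as np

rng = np.random.default_rng(0)
X_raw = rng.normal(size=(200, 5))              # training data, rows = samples (illustrative)
mu = X_raw.mean(axis=0)
X = X_raw - mu                                 # zero-mean training matrix

# Eigen-decomposition of the covariance matrix; retain the leading components as P.
eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
order = np.argsort(eigvals)[::-1]
P = eigvecs[:, order[:2]]                      # two components kept here, purely for illustration

def q_statistic(x_sample):
    """Q-statistic of one sample: Q = e e^T with e = x - x P P^T (equations (2)-(7))."""
    x = x_sample - mu                          # step (b): scale to zero mean with the training mean
    e = x - x @ P @ P.T                        # residual, equation (5)
    return float(e @ e)                        # equation (7)

# Empirical threshold from training data (stand-in for the chi-square formula of Liu (2004)).
Q_threshold = np.percentile([q_statistic(row) for row in X_raw], 99)

x_new = rng.normal(size=5) + 3.0               # a drifting sample
print(q_statistic(x_new) > Q_threshold)        # True flags the sample for re-evaluation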
The Q-threshold value was determined using the chi-square distribution formula used in Liu (2004). If the Q-statistic > Q-threshold, then the sample should be evaluated for outliers and that sample reprocessed. The faults/failures are linked with related process variables within FSN; for example, sensor drift will be linked with PV1 in Table 1 below. This step in the accident forecasting algorithm is followed by identifying safety limits. Safety limits are determined for the selected process variables PVs for each fault / failure using the EWMA algorithm, which uses training data (from real time and simulation) to tune the target control limits to be
dynamically calculated for each trend. In addition, it will monitor process variability to ensure that the process variable is within desired limits. It can provide an indication of how much variability is in the process, which can be used to estimate the remaining time to be within the limits. This is performed both offline and online and can determine dynamic safety limits for each operation step and for each process variable, as shown in figure 6. Table 1. Principal components for the PVs related to the fault / failure of drift in sensor data, processed by Matlab. This example shows the first principal component identifying PV1 as the process variable most related to the drift failure.
Matlab output of principal_components =
-0.4455  -0.5685  -0.4703   0.4480   0.2376
-0.4455  -0.4410   0.1582  -0.5641  -0.5137
-0.4502   0.0314   0.6677  -0.0206  -0.4495
 0.4277   0.1655  -0.4454   0.5463  -0.5297
-0.4199   0.5917   0.5517  -0.5322   0.2152
Fig. 6. EWMA or Exponentially Weighted Moving Average, is used to determine safety limits of each process variable. This example shows increasing safety limits during startup, and stabilized safety limits during steady state.
The EWMA follows equation (8) below, where z_i represents the statistical variable corresponding to the selected process variable x_i, and λ is the weight assigned to the current sample mean. For example, if λ = 0.2, then the weight assigned to the current sample mean is 0.2 and the weights given to the preceding means are 0.16, 0.128, 0.1024, etc.
z_i = λ x_i + (1 − λ) z_{i−1},   0 < λ ≤ 1,   z_0 = μ_0     (8)
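Equation (8) transcribes directly into a short Python routine. The sample series below, the choice λ = 0.2 and the use of the first sample as μ_0 are illustrative assumptions, not values from the paper.

def ewma(samples, lam=0.2, z0=None):
    """EWMA of equation (8): z_i = lam * x_i + (1 - lam) * z_{i-1}, with z_0 = mu_0."""
    z = samples[0] if z0 is None else z0       # mu_0 taken as the first sample here (assumption)
    track = []
    for x in samples:
        z = lam * x + (1.0 - lam) * z
        track.append(z)
    return track

# Toy process variable: start-up transient followed by steady state.
pv = [9.4, 9.6, 9.9, 10.1, 10.3, 10.4, 10.5, 10.5, 10.6, 10.5, 10.6, 10.6]
print([round(z, 3) for z in ewma(pv, lam=0.2)])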
After determining process limits and possible variability for each operation, process variables can be predicted. This is performed in two steps: first, training data or simulation is used to determine the predicted profile, which is followed by estimating future deviations using combined Bayesian and neurofuzzy methods, as explained in the following section. 3.2 Predicting Future Deviations of Process Variables In order to predict process variables, fuzzy logic is combined with an ANN to develop rules that describe the different trends of process variables for different operations. The proposed algorithm is composed of the following steps: 1. obtain training data of normal operation for the selected process variables, obtained from the PCA analysis described previously; 2. load the training data into the neurofuzzy engine, which was developed in MATLAB; 3. define the type and number of membership functions; 4. generate the Fuzzy Inference System (FIS), which is a set of rules associated with the different artificial networks created for the given data set; 5. train the FIS by generating possible rules; and 6. test the generated FIS with given process data sets to verify the minimum error of predicting the process variable. In this section, a training set of over 250 iterations was collected for system transients to ensure an accurate representation of the interrelation between process variables, which are identified as: condenser pressure, condenser level, pressurizer pressure, pressurizer level, reflux cooling flow, and flow at the associated control valve. Two triangular membership functions were selected, and the FIS was generated. The algorithm uses a hybrid optimization method for training, which combines least-squares and the backpropagation gradient descent method. The result with 3 epochs shows a root mean squared error of 0.0771325. The prediction plotted against the actual training data is shown in figure 7.
Fig. 7. Training vs. predicted data of feed flow as estimated by FIS for the case study
Fig. 8. Real-time vs. predicted data using the proposed FIS. The data trend on top is the faulty process data, which deviated from the bottom trend of predicted data. The proposed neurofuzzy algorithm is based on recognizing the deviations between the trends of the actual process variable and the predicted process variable. This requires accurate analysis of the different trends. In the following section, a trend analysis mechanism is proposed.
The proposed algorithm is used to predict the deviation of an actual run. In this case, a "valve fails open" fault was predicted. The output of the FIS is shown in figure 8, where the blue line (on the top) is the data obtained from process data (fault), while the red line (on the bottom) is the predicted data. The system shows a deviation which indicates a fault case. 3.3 Trend Analysis Algorithm Trend analysis is proposed to be able to compare different trends accurately. Each trend is composed of a series of identifiers which capture the changes of the corresponding process variable; the primitives of [6-7] were applied. The basis of this method involves nine primitives, as shown in figure 9. Each primitive is characterized by the sign of the first and second derivatives of a trend. By using the nine primitives, the shape of the trend is captured, as shown in figure 9. The classification of each trend will enable the comparison of different trends, individually or as a group of trends. Using the primitives as inputs to the FIS will facilitate removing noise in trends and enable better accuracy in modeling accident scenarios. A database is used to store trend signatures as sets of primitives and associate them with faults / failures within FSN. In order to improve the learning technique, the Bayesian method is used to train the network using posterior and prior data, as explained in the next section.
Fig. 9. Primitives used to classify trends. Each primitive is characterized by the sign of the first and second derivatives of a trend, shown as the last two signs of each trend. Typically, trends don’t match exactly. This means, small differences between trends shouldn’t be classified as different trends. Similarity matrix is used to show the similarity between different primitives, as shown in table 2.
Table 2. Similarity matrix used to show the similarity between different primitives. When two or more trends are compared, any difference is checked against the similarity matrix to confirm a real difference.

     A     B     C     D     E     F     G     H     I
A    1     0     0     0     0.5   0     0.75  0     0
B    0     1     0     0     0     0.5   0     0.75  0
C    0     0     1     0     0     0     0     0     0.25
D    0     0     0     1     0     0     0     0     0.25
E    0.5   0     0     0     1     0     0.75  0     0
F    0     0.5   0     0     0     1     0     0.75  0
G    0.75  0     0     0     0.75  0     1     0     0.25
H    0     0.75  0     0     0     0.75  0     1     0.25
I    0     0     0.25  0.25  0     0     0.25  0.25  1
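To illustrate how the similarity matrix could be applied when two primitive sequences are compared, the following sketch scores two trend signatures. The non-zero entries are transcribed from Table 2, while aggregating primitive-by-primitive similarities by averaging is our own assumption; the paper does not specify the aggregation.

# Similarity values between primitives A-I, transcribed from Table 2 (symmetric matrix).
SIM = {
    ("A", "E"): 0.5, ("A", "G"): 0.75, ("B", "F"): 0.5, ("B", "H"): 0.75,
    ("C", "I"): 0.25, ("D", "I"): 0.25, ("E", "G"): 0.75, ("F", "H"): 0.75,
    ("G", "I"): 0.25, ("H", "I"): 0.25,
}

def similarity(p, q):
    if p == q:
        return 1.0
    return SIM.get((p, q), SIM.get((q, p), 0.0))

def trend_similarity(sig1, sig2):
    """Average primitive-by-primitive similarity of two equal-length trend signatures (assumption)."""
    return sum(similarity(p, q) for p, q in zip(sig1, sig2)) / len(sig1)

print(trend_similarity("ABEG", "ABGE"))   # small differences score close to 1 (here 0.875)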
3.4 Bayesian Learning The likelihoods and posterior probabilities will tune the FSN by integrating the prediction values with the probability updates. Boiler level and feedwater flow were low, which might be due to a fault in either of the two upstream valves, as shown in figure 10. Equations (9-10) are used to calculate the likelihood for process variables approaching both the upper and lower control limits.
Fig. 10. Building a Bayesian belief network and updating the likelihood and posterior probabilities for three different faults which might affect the selected process variables.
P(A | X = high) = 1 − (P(X_upper limit) − P(X))     (9)

P(A | X = low) = 1 − (P(X) − P(X_lower limit))     (10)
Table 3. The conditional probability table for the likelihood P(A|X)

State of X    P(A|X=high)    P(A|X=low)
High          0.9            0.1
Normal        0.4            0.4
Low           0.1            0.9

Table 3 shows the conditional probabilities used to estimate the likelihood.
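A small sketch of how the conditional probabilities of Table 3 might be used inside the belief network is given below: the probability of approaching a control limit is obtained by marginalizing over the belief about the state of X. The illustrative belief over X and the marginalization step are our assumptions, not the paper's exact network.

# Conditional probability table from Table 3: P(A | state of X),
# where A is the event that the variable approaches its high (or low) control limit.
P_A_HIGH_GIVEN_X = {"High": 0.9, "Normal": 0.4, "Low": 0.1}
P_A_LOW_GIVEN_X  = {"High": 0.1, "Normal": 0.4, "Low": 0.9}

def marginal(cpt, belief_over_x):
    """Marginalize: P(A) = sum over x of P(A | X = x) * P(X = x)."""
    return sum(cpt[x] * p for x, p in belief_over_x.items())

# Illustrative belief over the state of X (e.g. boiler level), e.g. updated from the trend analysis.
belief_x = {"High": 0.05, "Normal": 0.25, "Low": 0.70}
print(round(marginal(P_A_LOW_GIVEN_X, belief_x), 3))   # probability of approaching the lower limit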
4 Conclusion Accident forecasting can be accurately performed by using causation models. However, there is a large number of possible causation models, which makes it difficult to accumulate and analyze them. The fault semantic network (FSN) is proposed to integrate all possible fault propagation scenarios associated with process variables and process models represented in POOM. Previous LNG accidents are modeled and represented using the proposed fault / accident forecasting mechanism.
Acknowledgement The author thankfully acknowledges the financial support provided by the Qatar National Research Foundation through National Priority research project number 08-074-2-015. Thanks to the research students at UOIT and to the collaborators from Japan for providing accident data about LNG processes.
References [1] Yam, R. C. M., Tse, P. W., Li, L., Tu., P., Intelligent Predictive Decision Support System for Condition-Based Maintenance [J], the international journal of advanced manufacturing technology, 17: 383-391 (2001). [2] Lu, K. S., Saeks, R., Failure prediction for an on-line maintenance system in a Poisson shock environment, IEEE Transactions on Systems, Man, and Cybernetics, 9(6): 356–362 (1979). [3] Verdier, G., Hilgert, N., Vila, J.P,. Adaptive threshold computation for CUSUM-type procedures in change detection and isolation problems [J], computational statistics and data analysis, 2008, 52: 4161-4174 (2008). [4] Gabbar, H.A., Qualitative Fault Propagation Analysis. Journal of Loss Prevention in the Process Industries, Vol. 20, No. 3, 260–270 (2007). [5] Incident News, http://www.incidentnews.gov/ [6] Dash, S., Rengaswamy, R., & Venkatasubramanian, V., Fuzzy-logic based trend classification for fault diagnosis of chemical processes. Computers and Chemical Engineering, 27, 347-362 (2003).
Fuzzy Group Evaluating the Aggregative Risk Rate of Software Development Huey-Ming Lee1 and Lily Lin2 1 Department
of Information Management, Chinese Culture University 55, Hwa-Kung Road, Yang-Ming-San, Taipei (11114), Taiwan 2 Department of International Business, China University of Technology 56, Sec. 3, Hsing-Lung Road, Taipei (116), Taiwan [email protected], [email protected]
Abstract. In this paper, we propose the fuzzy group decision making to tackle the rate of aggregative risk in software development. The proposed method is easily implemented and also suitable for only one evaluator. Keywords: Fuzzy risk assessment; aggregative risk rate.
1 Introduction Risk is the traditional manner of expressing uncertainty in the systems life cycle. Once risks have been identified, they must then be assessed as to their potential severity of loss and their probability of occurrence. Risk assessment is a common first step and also the most important step in a risk management process. Risk assessment is the determination of the quantitative or qualitative value of risk related to a concrete situation and a recognized threat. In a quantitative sense, it is the probability at a given point in a system's life cycle that predicted goals cannot be achieved with the available resources. Due to the complexity of risk factors and the compounding uncertainty associated with future sources of risk, risk is normally not treated with mathematical rigor during the early life cycle phases [1]. Risks result in project problems such as schedule and cost overruns, so risk minimization is a very important project management activity [13]. Up to now, there are many papers investigating risk identification, risk analysis, risk priority, and risk management planning [1-4, 6-7]. Based on [2-4, 6-7], Lee [9] classified the risk factors into six attributes, divided each attribute into some risk items, built up the hierarchical structured model of aggregative risk and the evaluating procedure of the structured model, ranged the grade of risk for each risk item into eleven ranks, and proposed a procedure to evaluate the rate of aggregative risk using a two-stage fuzzy assessment method. Chen [5] ranged the grade of risk for each risk item into thirteen ranks, and defuzzified the trapezoid or triangular fuzzy numbers by the median. Based on [9], Lee et al. [10] presented a new algorithm to tackle the rate of aggregative risk. Lin and Lee [11] presented a new fuzzy assessment form and synthetic analysis for the rate of aggregative risk. Lee and Lin [12] presented a method to evaluate the presumptive rate of aggregative risk in software development. Based on [11, 12], we propose group evaluators to tackle the fuzzy rate of aggregative risk in this paper. The proposed method is easier, more flexible and more useful to implement. R. Setchi et al. (Eds.): KES 2010, Part II, LNAI 6277, pp. 438–444, 2010. © Springer-Verlag Berlin Heidelberg 2010
2 Preliminaries For the proposed algorithm, all pertinent definitions of fuzzy sets are given below [8, 14].
Definition 2.1. Let Ã = (p, q, r), with p < q < r and p, q, r ∈ R. It is called a triangular fuzzy number if its membership function is

μ_Ã(x) = (x − p)/(q − p)  if p ≤ x ≤ q;  (r − x)/(r − q)  if q ≤ x ≤ r;  0  otherwise.     (1)

If r = q and p = q, then Ã is (q, q, q), and we call it the fuzzy point q̃ at q.
Fuzzy numbers must first be mapped to real values, which can then be compared for magnitude. This assignment of a real value to a fuzzy number is called defuzzification. It can take many forms, but the most standard defuzzification is through computing the centroid. This is defined, effectively, as the center of gravity of the curve describing a given fuzzy quantity. Because of this definition, its computation requires integration of the membership functions.
Let Ã = (p, q, r). We denote

C(Ã) = ∫_p^r x μ_Ã(x) dx / ∫_p^r μ_Ã(x) dx     (2)

as the defuzzification of Ã by the centroid method, where p and r are the lower and upper limits of the integral, respectively. Then, we obtain that the centroid of Ã = (p, q, r) is

C(Ã) = (1/3)(p + q + r)     (3)

Proposition 2.1. Let Ã1 = (p1, q1, r1) and Ã2 = (p2, q2, r2) be two triangular fuzzy numbers, and k > 0. Then we have
(1°) Ã1 ⊕ Ã2 = (p1 + p2, q1 + q2, r1 + r2)
(2°) kÃ1 = (kp1, kq1, kr1)
We can easily show Proposition 2.1 by the extension principle.
Proposition 2.2. Let Ã1 = (p1, q1, r1) and Ã2 = (p2, q2, r2) be two triangular fuzzy numbers, k ∈ R, and let C(Ãj) be the centroid of Ãj, for j = 1, 2. Then we have
(1°) C(Ã1 ⊕ Ã2) = C(Ã1) + C(Ã2)
(2°) C(kÃ1) = kC(Ã1)
We can also easily show Proposition 2.2 by Proposition 2.1 and Eq. (3).
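The operations of Definition 2.1 and Propositions 2.1-2.2 admit a very small, self-contained sketch in Python; the class layout below and the two linguistic values used as test data (which anticipate equation (4) of the next section) are our own illustrative choices.

from dataclasses import dataclass

@dataclass
class TriangularFuzzyNumber:
    """Triangular fuzzy number A = (p, q, r) of Definition 2.1."""
    p: float
    q: float
    r: float

    def centroid(self) -> float:
        """Centroid defuzzification of equation (3): C(A) = (p + q + r) / 3."""
        return (self.p + self.q + self.r) / 3.0

    def __add__(self, other):          # Proposition 2.1 (1): component-wise addition
        return TriangularFuzzyNumber(self.p + other.p, self.q + other.q, self.r + other.r)

    def scale(self, k: float):         # Proposition 2.1 (2), k > 0
        return TriangularFuzzyNumber(k * self.p, k * self.q, k * self.r)

A1 = TriangularFuzzyNumber(0.0, 1.0 / 8, 2.0 / 8)    # the linguistic value V1 of Section 3
A2 = TriangularFuzzyNumber(1.0 / 8, 2.0 / 8, 3.0 / 8)  # V2
# Proposition 2.2: the centroid is additive and positively homogeneous.
assert abs((A1 + A2).centroid() - (A1.centroid() + A2.centroid())) < 1e-12
assert abs(A1.scale(2.0).centroid() - 2.0 * A1.centroid()) < 1e-12
print(A1.centroid(), A2.centroid())                  # -> 0.125 (= 1/8), 0.25 (= 2/8)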
3 Assessment Form for the Risk Items The criteria ratings of risk are linguistic variables with linguistic values V1 , V2 , …, V7 , where V1 = extra low, V2 = very low, V3 = low, V4 = middle, V5 = high, V6 = very high, V7 = extra high. These linguistic values are treated as triangular fuzzy numbers as follows:
Ṽi = ((i − 1)/8, i/8, (i + 1)/8),  for i = 1, 2, ..., 7     (4)

Now, defuzzifying Ṽi by the centroid method, we have C(Ṽi) = i/8 as the center of mass of Ṽi, for i = 1, 2, ..., 7. Based on the structured model of aggregative risk proposed by Lee [9], we propose the new assessment form of the structured model as shown in Table 1 and propose an algorithm to tackle the rate of aggregative risk in software development. In Table 1,
(a) w2(k) is the weight of the risk attribute Xk, and

Σ_{k=1}^{6} w2(k) = 1,  0 ≤ w2(k) ≤ 1,     (5)

for each k = 1, 2, ..., 6.
(b) w1(k, i) is the weight of the risk item X_{k,i}, and

Σ_{i=1}^{nk} w1(k, i) = 1,  0 ≤ w1(k, i) ≤ 1,     (6)

where n1 = 1, n2 = 4, n3 = 2, n4 = 4, n5 = 2, n6 = 1; i = 1, 2, ..., nk; k = 1, 2, ..., 6.
(c) m_{ki}^{(l)} is the number of selections of the fuzzy linguistic value Vl with respect to X_{ki}.
4 The Proposed Algorithm
In Table 1,
(a) We let

m_{ki} = Σ_{l=1}^{7} m_{ki}^{(l)},     (7)
Table 1. Contents of the hierarchical structure model [12]

Risk attribute (Weight-2); risk item (Weight-1); entries under the linguistic variables V1-V7
X1: Personal (W2(1))
  X11: Personal shortfalls, key person(s) quit (W1(1,1)): m11(1), m11(2), ..., m11(7)
X2: System requirement (W2(2))
  X21: Requirement ambiguity (W1(2,1)): m21(1), ..., m21(7)
  X22: Developing the wrong software function (W1(2,2)): m22(1), ..., m22(7)
  X23: Developing the wrong user interface (W1(2,3)): m23(1), ..., m23(7)
  X24: Continuing stream requirement changes (W1(2,4)): m24(1), ..., m24(7)
X3: Schedules and budgets (W2(3))
  X31: Schedule not accurate (W1(3,1)): m31(1), ..., m31(7)
  X32: Budget not sufficient (W1(3,2)): m32(1), ..., m32(7)
X4: Developing technology (W2(4))
  X41: Gold-plating (W1(4,1)): m41(1), ..., m41(7)
  X42: Skill levels inadequate (W1(4,2)): m42(1), ..., m42(7)
  X43: Straining hardware (W1(4,3)): m43(1), ..., m43(7)
  X44: Straining software (W1(4,4)): m44(1), ..., m44(7)
X5: External resource (W2(5))
  X51: Shortfalls in externally furnished components (W1(5,1)): m51(1), ..., m51(7)
  X52: Shortfalls in externally performed tasks (W1(5,2)): m52(1), ..., m52(7)
X6: Performance (W2(6))
  X61: Real-time performance shortfalls (W1(6,1)): m61(1), ..., m61(7)

For each risk item X_{ki}, the entry under the linguistic variable Vl (l = 1, ..., 7) is the number of selections m_{ki}^{(l)}.
for k = 1, 2, ..., 6; i = 1, 2, ..., nk. Then, (m_{ki}^{(l)}/m_{ki}) Ṽl is the weighted triangular fuzzy number of Ṽl with respect to the linguistic value Vl of the sub-item X_{ki}, for k = 1, 2, ..., 6; i = 1, 2, ..., nk.
(b) Defuzzifying (m_{ki}^{(l)}/m_{ki}) Ṽl by the centroid, we have

C((m_{ki}^{(l)}/m_{ki}) Ṽl) = (l/8) (m_{ki}^{(l)}/m_{ki})     (8)

We propose the following proposition to tackle the rate of aggregative risk.
Proposition 4.1
1) For the sub-item X_{ki}, the risk rate is

R^{(k,i)} = (1/8) Σ_{l=1}^{7} l · m_{ki}^{(l)}/m_{ki}     (9)

for k = 1, 2, ..., 6; i = 1, 2, ..., nk.
2) For the attribute Xk, the risk rate is

R^{(k)} = Σ_{i=1}^{nk} w1(k, i) R^{(k,i)} = (1/8) Σ_{i=1}^{nk} w1(k, i) Σ_{l=1}^{7} l · m_{ki}^{(l)}/m_{ki}     (10)

3) The rate of aggregative risk is

R = Σ_{k=1}^{6} w2(k) R^{(k)} = (1/8) Σ_{k=1}^{6} w2(k) Σ_{i=1}^{nk} w1(k, i) Σ_{l=1}^{7} l · m_{ki}^{(l)}/m_{ki}     (11)

5 Numerical Example
Assume that we have the risk attributes in software development, the risk items and the numbers of selections of the fuzzy linguistic value Vl with respect to X_{ki} as shown in Table 2 as an example to implement the proposed algorithm. We then have the following computed results.
Table 2. Contents of the implementing example

Risk attribute (Weight-2); risk item (Weight-1); numbers of selections for V1, V2, V3, V4, V5, V6, V7
X1: Personal (0.3)
  X11: Personal shortfalls, key person(s) quit (1): 1, 3, 3, 3, 5, 0, 0
X2: System requirement (0.3)
  X21: Requirement ambiguity (0.4): 1, 3, 4, 3, 4, 0, 0
  X22: Developing the wrong software function (0.4): 1, 2, 2, 3, 4, 2, 0
  X23: Developing the wrong user interface (0.1): 1, 2, 3, 4, 4, 0, 0
  X24: Continuing stream requirement changes (0.1): 1, 2, 3, 3, 4, 1, 0
X3: Schedules and budgets (0.1)
  X31: Schedule not accurate (0.5): 1, 3, 3, 3, 2, 1, 0
  X32: Budget not sufficient (0.5): 0, 3, 2, 4, 4, 1, 0
X4: Developing technology (0.1)
  X41: Gold-plating (0.3): 1, 1, 2, 4, 4, 1, 0
  X42: Skill levels inadequate (0.1): 1, 2, 3, 4, 5, 1, 0
  X43: Straining hardware (0.3): 1, 3, 4, 4, 4, 0, 0
  X44: Straining software (0.3): 2, 2, 1, 3, 3, 1, 1
X5: External resource (0.1)
  X51: Shortfalls in externally furnished components (0.5): 1, 3, 3, 3, 2, 1, 0
  X52: Shortfalls in externally performed tasks (0.5): 1, 2, 3, 3, 3, 1, 0
X6: Performance (0.1)
  X61: Real-time performance shortfalls (1): 2, 3, 4, 4, 2, 0, 0
(a) The risk rate of each risk attribute is as follows:
Risk attribute               Risk rate
X1: Personal                 0.441667
X2: System requirement       0.4575
X3: Schedules and budgets    0.45261
X4: Developing technology    0.465024
X5: External resource        0.4375
X6: Performance              0.383333
(b) The rate of aggregative risk is 0.443597.
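The reported rates follow directly from Proposition 4.1; the short Python sketch below, which is our own transcription of the Table 2 data rather than the authors' code, reproduces the aggregative rate.

# Sketch of Proposition 4.1 applied to the data of Table 2 (weights and selection counts
# transcribed from the table).
W2 = {"X1": 0.3, "X2": 0.3, "X3": 0.1, "X4": 0.1, "X5": 0.1, "X6": 0.1}
ITEMS = {  # attribute -> list of (w1, [m^(1), ..., m^(7)])
    "X1": [(1.0, [1, 3, 3, 3, 5, 0, 0])],
    "X2": [(0.4, [1, 3, 4, 3, 4, 0, 0]), (0.4, [1, 2, 2, 3, 4, 2, 0]),
           (0.1, [1, 2, 3, 4, 4, 0, 0]), (0.1, [1, 2, 3, 3, 4, 1, 0])],
    "X3": [(0.5, [1, 3, 3, 3, 2, 1, 0]), (0.5, [0, 3, 2, 4, 4, 1, 0])],
    "X4": [(0.3, [1, 1, 2, 4, 4, 1, 0]), (0.1, [1, 2, 3, 4, 5, 1, 0]),
           (0.3, [1, 3, 4, 4, 4, 0, 0]), (0.3, [2, 2, 1, 3, 3, 1, 1])],
    "X5": [(0.5, [1, 3, 3, 3, 2, 1, 0]), (0.5, [1, 2, 3, 3, 3, 1, 0])],
    "X6": [(1.0, [2, 3, 4, 4, 2, 0, 0])],
}

def item_rate(counts):                       # equation (9)
    total = sum(counts)
    return sum(l * m for l, m in enumerate(counts, start=1)) / (8.0 * total)

def attribute_rate(items):                   # equation (10)
    return sum(w1 * item_rate(m) for w1, m in items)

R = sum(W2[k] * attribute_rate(ITEMS[k]) for k in W2)   # equation (11)
print(round(R, 6))                           # -> 0.443597, matching the reported result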
6 Conclusion A general survey forces evaluators to assess one grade from the grade table of risk for each risk item, but it ignores the uncertainty of human thought. In this paper, we
propose group decision making to evaluate the rate of aggregative risk in software development. The proposed method is easier, more flexible and more useful to implement.
Acknowledgments The authors gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation. This work was supported in part by the National Science Council of Taiwan, R.O.C., under grant No. 98-2410-H-034-039-.
References 1. AFSC: Software Risk Abatement, U. S. Air Force Systems Command, AFSC/AFLC pamphlet 800-45, Andrews AFB, MD, Sep., pp. 1-28 (1988) 2. Boehm, B.W.: Software Risk Management. CS Press, Los Alamitos (1989) 3. Boehm, B.W.: Software Risk Management: Principles and Practices. IEEE Software 8, 32–41 (1991) 4. Charette, R.N.: Software Engineering Risk Analysis and Management. Mc Graw-Hill, New York (1989) 5. Chen, S.M.: Evaluating the Rate of Aggregative Risk in Software Development Using Fuzzy Set Theory. Cybernetics and Systems: International Journal 30, 57–75 (1999) 6. Conger, S.A.: The New Software Engineering. Wadsworth Publishing Co., Belmont (1994) 7. Gilb, T.: Principles of Software Engineering Management. Addison-Wesley Publishing Co., New York (1988) 8. Kaufmann, A., Gupta, M.M.: Introduction to Fuzzy Arithmetic Theory and Applications. Van Nostrand Reinhold, New York (1991) 9. Lee, H.-M.: Applying fuzzy set theory to evaluate the rate of aggregative risk in software development. Fuzzy Sets and Systems 79, 323–336 (1996) 10. Lee, H.-M., Lee, S.-Y., Lee, T.-Y., Chen, J.-J.: A new algorithm for applying fuzzy set theory to evaluate the rate of aggregative risk in software development. Information Sciences 153, 177–197 (2003) 11. Lin, L., Lee, H.-M.: A Fuzzy Assessment Model for Evaluating the Rate of Risk in Software Development. International Journal of Computers, Issue 3(1), 64–68 (2007) 12. Lee, H.-M., Lin, L.: A Fuzzy Presumptive Rate of Aggregative Risk in Software Development. In: Fourth International Conference on Innovative Computing, Information and Control (ICICIC 2009), Kaohsiung, Taiwan, December 7-9 (2009) 13. Sommerville, I.: Software Engineering, 6th edn. Pearson Education Limited, England (2001) 14. Zimmermann, H.-J.: Fuzzy set theory and its Applications, 2nd revised edn. Kluwer Academic Publishers, Dordrecht (1991)
Fuzzy Power System Reliability Model Based on Value-at-Risk Bo Wang, You Li, and Junzo Watada Graduate School of Information, Production and Systems, Waseda University, Kitakyushu, Japan [email protected], [email protected], [email protected]
Abstract. Conventional power system optimization problems deal with the power demand and spinning reserve through real values. In this research, we employ fuzzy variables to better characterize these values in an uncertain environment. In building the fuzzy power system reliability model, fuzzy Value-at-Risk (VaR) can evaluate the greatest value under a given confidence level and is a new technique to measure the constraints and system reliability. The proposed model is a complex nonlinear optimization problem which cannot be solved by the simplex algorithm. In this paper, particle swarm optimization (PSO) is used to find the optimal solution. The original PSO algorithm is improved to straighten out the local convergence problem. Finally, the proposed model and algorithm are exemplified by one numerical example. Keywords: Fuzzy variable, Value-at-Risk, System reliability, Improved particle swarm optimization algorithm.
1 Introduction In conventional power system optimization problems [1, 2], power demand and spinning reserve are evaluated as real values using historical data. However, many factors affect power supply in real situations. The system usually needs to generate different power for the same unit time of each day. As a result, power systems suffer from the uncertainty that the real power demands always depart from forecasts [3]. That is to say, the operating schemes obtained from real values sometimes cannot satisfy the real demands, and the system might be unsafe during a period of the total time span. Therefore, to handle such uncertainty on the basis of previous data and experts' opinions, it is more reasonable to treat the power demand and spinning reserve as variables with imprecise distributions, i.e., fuzzy variables. In the literature, fuzzy set theory has been applied to this topic [4, 5, 6]. These researches illustrate the optimization problems as: decreasing the cost as much as possible without moving too many control settings, while satisfying the soft constraints as much as possible and enforcing the hard constraints exactly. The soft constraints were mainly measured by the membership function of fuzzy variables, and then the results and control processes were obtained according to the objective function and membership degree. However, in this research, we use the fuzzy Value-at-Risk criterion to measure the R. Setchi et al. (Eds.): KES 2010, Part II, LNAI 6277, pp. 445–453, 2010. © Springer-Verlag Berlin Heidelberg 2010
soft constraints and the reliability of the whole system. On the basis of fuzzy Value-at-Risk, we can provide more significant information for decision makers. Moreover, it has been recognized for many years that the economic dispatch may be unsafe [2]. Without violating the system constraints, if we only minimize the cost as much as possible, sometimes the whole system cannot be trusted. To avoid the extreme profit-based power dispatch and equip the system with a higher reliability, we adopt variance as another constraint. This optimization problem has commonly been formulated as a complex nonlinear optimization problem. It is difficult to solve the problem by a linear programming approach. Therefore, we propose a fuzzy-simulation-based particle swarm optimization as the solution method. To straighten out the defect of the original PSO, an improved PSO (IPSO) is designed to find better solutions. The remainder of this paper is organized as follows: In Section 2, we briefly introduce some basic knowledge on fuzzy variables; Section 3 gives the concept of fuzzy Value-at-Risk and its calculation; Section 4 illustrates the VaR based fuzzy power system reliability model (VaR-FPSRM); Section 5 then provides an IPSO as the solution; the proposed model and algorithm are exemplified by one numerical example in Section 6; finally, Section 7 summarizes the conclusions.
2 Preliminaries The fuzzy variable is a fundamental mathematical tool to describe fuzzy uncertainty, and there are mainly three types of fuzzy variables frequently used: triangular, trapezoidal and normally distributed. Before introducing the fuzzy VaR, we briefly review some basic facts on fuzzy variables. Let ξ be a fuzzy variable whose membership function is μ_ξ, and let r be a real number. The possibility and credibility of the event ξ ≤ r are expressed respectively as follows:

Pos{ξ ≤ r} = sup_{t ≤ r} μ_ξ(t),   Nec{ξ ≤ r} = 1 − sup_{t > r} μ_ξ(t),     (1)

and

Cr{ξ ≤ r} = (1/2)[Pos{ξ ≤ r} + Nec{ξ ≤ r}].     (2)
The credibility measure is formed on the basis of the possibility and necessity measures and in the simplest case it is taken as their average. The credibility measure is a self-dual set function [7], i.e. Cr{ξ ≤ r} = 1 − Cr{ξ > r}. Equation (1) results in the following:

Cr{ξ ≤ r} = (1/2)[sup_{t ≤ r} μ_ξ(t) + 1 − sup_{t > r} μ_ξ(t)].     (3)

For further discussion, we notice the following:
For an n-ary fuzzy vector ξ = (ξ1, ξ2, ..., ξn), ξk is a fuzzy variable, k = 1, 2, ..., n. The membership function of ξ is given by taking the minimum of the individual coordinates, that is

μ_ξ(t) = min{μ_{ξ1}(t1), μ_{ξ2}(t2), ..., μ_{ξn}(tn)},     (4)

where t = (t1, t2, ..., tn) ∈ ℜ^n. For more knowledge on fuzzy variables, refer to [7], [8], [9].
3 Fuzzy Value-at-Risk According to [10], the VaR of an investment can be considered as the greatest value with a given confidence level. In a fuzzy environment [11], if we denote by the fuzzy variable L the forecasted power demand of unit time t, the VaR of L with a confidence of (1 − β) can be written as:

VaR_β = sup{λ | Cr{L ≥ λ} ≥ β},     (5)

where β ∈ (0, 1).
Example: For a normal distribution fuzzy variable L = FN(1.0, 2.0), its membership function is:

μ_L(t) = exp{−[(t − 1.0)/2]²}.     (6)

According to the knowledge above, we calculate VaR_{0.2}. First, we need to calculate the credibility function of L ≥ r. From equation (2), we can compute the possibility and necessity of L ≥ r:

Pos{L ≥ r} = 1  if r ≤ 1.0;  exp{−[(r − 1.0)/2]²}  if r > 1.0,
Nec{L ≥ r} = 1 − exp{−[(r − 1.0)/2]²}  if r ≤ 1.0;  0  if r > 1.0.     (7)

Hence, according to equation (3), the credibility of L ≥ r is expressed as:

Cr{L ≥ r} = 1 − exp{−[(r − 1.0)/2]²}/2  if r ≤ 1.0;  exp{−[(r − 1.0)/2]²}/2  if r > 1.0.

Therefore, equation (5) results in the following:

VaR_{0.2} = sup{r | Cr{L ≥ r} ≥ 0.2} = 2.91.     (8)
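The example value can be checked numerically with a few lines of Python. The closed-form credibility follows our reading of equations (1)-(3) and (6); the bisection search and its bracketing interval are illustrative assumptions.

import math

def credibility_ge(r, a=1.0, sigma=2.0):
    """Cr{L >= r} for the normal fuzzy variable L = FN(a, sigma) of the example."""
    mu = math.exp(-((r - a) / sigma) ** 2)
    return 1.0 - mu / 2.0 if r <= a else mu / 2.0

def fuzzy_var(beta, a=1.0, sigma=2.0, tol=1e-6):
    """VaR_beta = sup{ r | Cr{L >= r} >= beta }, found by bisection (Cr is non-increasing in r)."""
    lo, hi = a - 10.0 * sigma, a + 10.0 * sigma
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if credibility_ge(mid, a, sigma) >= beta:
            lo = mid
        else:
            hi = mid
    return lo

print(round(fuzzy_var(0.2), 2))   # -> 2.91, the value obtained in equation (8)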
4 VaR Based Fuzzy Power System Reliability Model The objective of VaR-FPSRM is to minimize the cost of real generated and reserve power, while satisfying all of the constraints and the reliability of the system.
A. Objective function

Min Σ_{t=1}^{T} [f1(Pt) + Pr · f2(Rt)]     (9)

Respectively, at each unit time t, Pt is the real power generated by the system, Rt denotes the real power reserve, Pr is the probability that the reserve is called, and f1, f2 are the cost functions of real generated power and reserve power. Here, the reserve cost is paid only when the reserve is called.
B. Constraints
1) System power balance. The system power balance is considered as a soft constraint. For each unit time, suppose the confidence level is (1 − β1) and P̃_D^t denotes the fuzzy demand at time t; then

VaR_(1−β1) = sup{λ1 | Cr(P̃_D^t ≥ λ1) ≥ β1}     (10)

This λ1 indicates the largest power demand under confidence level (1 − β1); if Pt is not smaller than λ1, the power supply can be treated as reliable for time t.
2) System spinning reserve requirement. The spinning reserve is prepared for an emergency which may not happen. It is reasonable to consider the reserve requirement as a soft constraint; then

VaR_(1−β2) = sup{λ2 | Cr(P̃_R^t ≥ λ2) ≥ β2}     (11)

If Rt is not smaller than λ2, the system can be treated as reliable at time t.
3) The reliability of the whole system. To keep the system in a normal state after some emergencies, we usually need the whole system to operate with high reliability. For the whole time span, the confidence level is (1 − β3) and λ3 = Σ_{t=1}^{T} (Pt + Rt); then

VaR_(1−β3) = sup{λ3 | Cr(Σ_{t=1}^{T} (P̃_D^t + P̃_R^t) ≥ λ3) ≥ β3}     (12)

If the generated real power over the whole time span can satisfy λ3, the system can be treated as reliable for the whole time under the confidence level (1 − β3).
4) The variance of the power distribution. The extreme profit-based power dispatch can minimize the cost but may leave the system unsafe in a period of time. Therefore, we employ variance as a constraint to overcome such a disadvantage. For the real generated power:

V = (1/T) Σ_{t=1}^{T} (Pt − λ1)² ≤ V1     (13)

For the real power reserve:

V = (1/T) Σ_{t=1}^{T} (Rt − λ2)² ≤ V2     (14)

Respectively, V1 and V2 denote the largest variance that can be accepted, and λ1, λ2 have the same meaning as mentioned above.
5 Solution 5.1 Fuzzy Simulation
In Section 3, we discussed the calculation of VaR for any single fuzzy variable. However, as in equations (11) and (12), we must deal with a series of different types of fuzzy variables. It is impossible to solve these problems from the definition directly. The fuzzy simulation developed by Liu [7] plays a pivotal role in solving the above problem. This approach is well described in [11] for further reference. 5.2 Particle Swarm Optimization
The model proposed above is a complex nonlinear problem which cannot be solved by a linear programming approach. Therefore, we employ the PSO algorithm as the solution. The PSO algorithm was initially proposed by Kennedy and Eberhart [12] in 1995. This algorithm uses collaboration among a population of simple search agents to find the optimal solution in a possible space. If the position of one particle can produce a better result, the other particles will approach this one. This algorithm has been widely used and has proved its effectiveness in a variety of fields; refer to [2], [13], [14] for details. It is well known that the PSO algorithm can find the optimal solution within fewer iterations than other algorithms, but it possibly suffers from the local convergence problem. In particular, when the selection problem contains a large number of objectives, this defect of the PSO becomes more obvious. To iron out this problem, we improve the original PSO algorithm by employing the escape speed (ES) and the particle restart position (PRP). 1) Escape speed: If the optimal solution cannot be updated after a fixed number of iterations, the search is considered to have fallen into local convergence. Then, each particle speed will be modified by the escape speed. If the escape speed is sufficiently large, all the particles can jump out of the current loop and restart searching in different areas. 2) Particle restart position: Once the particle speed has been modified by the escape speed, the particles will be distributed randomly. For most optimization problems, there should be a constraint on the search space. Then, the particle restart position is
generated at the edge value of the search space. We repeatedly modify the speed of each particle until they arrive at the restart position. Therefore, on the one hand, each particle can satisfy the constraint on the search space, whereas on the other hand, the particle speed can be updated from the escape speed to a suitable value. As a result, the potential global optimal solution will be found. Figure 1 gives a two-dimensional illustration of the improved PSO.
Fig. 1. Improved PSO algorithm
Based on the above analysis, we summarize the IPSO algorithm as follows:
Step 1. Initialize n particles. Each particle includes the real generated power and reserve of each unit time; for example, particle i is described as:

U_i = [ D_{i,1}  D_{i,2}  ...  D_{i,24} ;  R_{i,1}  R_{i,2}  ...  R_{i,24} ]

Step 2. Check whether the constraints are satisfied, which include the system power balance, spinning reserve requirement, system reliability and the variance measure of the power distribution. If any of the constraints cannot be satisfied, go to Step 1 and initialize the particle again.
Step 3. Calculate the fitness function (cost function) of each particle, then obtain the global best (gbest) and personal best (pbest) according to these values.
Step 4. Each particle is iterated to find better solutions. Then, the gbest and pbest are updated according to the optimal values. Over the course of the iterations, if the algorithm falls into local convergence, we update the position and speed of each particle by the improvement mentioned above.
Step 5. After completing all of the iterations, return the final value of gbest as the optimal result, and at the same time obtain the optimal power plan.
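A compact sketch of the escape-speed and edge-restart mechanism is given below. It uses a toy quadratic objective in place of the cost function of equation (9) and omits the constraint screening of Step 2; all parameter values, bounds and the stall counter are illustrative assumptions rather than the authors' settings.

import random

DIM, LOW, HIGH = 6, 0.0, 10.0
N_PARTICLES, ITERATIONS, STALL_LIMIT = 20, 300, 25
W, C1, C2, ESCAPE_SPEED = 0.7, 1.5, 1.5, 2.0

def cost(x):                                   # toy objective standing in for equation (9)
    return sum((xi - 3.0) ** 2 for xi in x)

def clamp(v):
    return max(LOW, min(HIGH, v))

pos = [[random.uniform(LOW, HIGH) for _ in range(DIM)] for _ in range(N_PARTICLES)]
vel = [[0.0] * DIM for _ in range(N_PARTICLES)]
pbest = [p[:] for p in pos]
gbest = min(pbest, key=cost)[:]
stall = 0

for _ in range(ITERATIONS):
    improved = False
    for i in range(N_PARTICLES):
        for d in range(DIM):
            vel[i][d] = (W * vel[i][d]
                         + C1 * random.random() * (pbest[i][d] - pos[i][d])
                         + C2 * random.random() * (gbest[d] - pos[i][d]))
            pos[i][d] = clamp(pos[i][d] + vel[i][d])
        if cost(pos[i]) < cost(pbest[i]):
            pbest[i] = pos[i][:]
            if cost(pbest[i]) < cost(gbest):
                gbest, improved = pbest[i][:], True
    stall = 0 if improved else stall + 1
    if stall >= STALL_LIMIT:                   # local convergence suspected: escape and restart
        for i in range(N_PARTICLES):
            vel[i] = [random.uniform(-ESCAPE_SPEED, ESCAPE_SPEED) for _ in range(DIM)]
            pos[i] = [random.choice([LOW, HIGH]) for _ in range(DIM)]   # restart at the search-space edge
        stall = 0

print(round(cost(gbest), 6), [round(v, 3) for v in gbest])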
6 Numerical Example In this section, the proposed model and algorithm are exemplified by the following numerical example. Suppose the 24 hourly demands and reserves are fuzzy variables evaluated from previous data and experts' opinions. Table 1 lists the forecasted values of each unit time and the probability (Pr) that the reserve is called. According to the analysis, the confidence levels of the system power balance and the spinning reserve requirement are set at 90% and 60%. The reliability of the whole system should be not less than 95%. That is, β1 = 0.1, β2 = 0.4, β3 = 0.05. To avoid the extreme profit-based power dispatch, the coefficients V1 and V2 in equations (13) and (14) are assigned as 30 and 4. In this research, the unit commitment problem is not taken into consideration; the cost function here is simply defined as a linear function: f(Pt) = Pc · Pt, where Pc is the cost per unit of power; suppose for each time Pc = 18 $/MW. Table 1. Forecasted demand and reserve
Hour   Demand (MW)          Reserve (MW)      Pr
1      (650, 700, 750)      (65, 70, 75)      0.05
2      (700, 750, 800)      (70, 75, 80)      0.05
3      (800, 850, 900)      (80, 85, 90)      0.05
4      (900, 950, 1000)     (90, 95, 100)     0.05
5      (950, 1000, 1050)    (95, 100, 105)    0.05
6      (1050, 1100, 1150)   (105, 110, 115)   0.08
7      (1100, 1150, 1200)   (110, 115, 120)   0.1
8      FN(1200, 440)        FN(120, 4.4)      0.1
9      FN(1300, 420)        FN(130, 4.2)      0.1
10     FN(1400, 400)        FN(140, 4.0)      0.1
11     FN(1450, 400)        FN(145, 4.0)      0.1
12     FN(1500, 390)        FN(150, 3.9)      0.12
13     (1350, 1400, 1450)   (135, 140, 145)   0.1
14     (1250, 1300, 1350)   (125, 130, 135)   0.1
15     (1150, 1200, 1250)   (115, 120, 125)   0.08
16     (1000, 1050, 1100)   (100, 105, 110)   0.08
17     FN(1000, 400)        FN(100, 4.0)      0.1
18     FN(1100, 410)        FN(110, 4.1)      0.1
19     FN(1200, 410)        FN(120, 4.1)      0.12
20     FN(1400, 430)        FN(140, 4.3)      0.12
21     FN(1300, 420)        FN(130, 4.2)      0.12
22     (1050, 1100, 1150)   (105, 110, 115)   0.1
23     (850, 900, 950)      (85, 90, 95)      0.08
24     (750, 800, 850)      (75, 80, 85)      0.05
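For instance, the hour-1 soft constraint of equation (10) can be evaluated directly from the triangular demand (650, 700, 750) in Table 1. The closed-form credibility below follows our reading of equations (1)-(3); the printed value is the λ1 that the generated power P1 would have to cover at β1 = 0.1.

def credibility_ge_triangular(r, a, b, c):
    """Cr{L >= r} for a triangular fuzzy demand L = (a, b, c), derived from equations (1)-(3)."""
    if r <= a:
        return 1.0
    if r <= b:
        return 1.0 - (r - a) / (2.0 * (b - a))   # Pos = 1, Nec = 1 - (r-a)/(b-a)
    if r <= c:
        return (c - r) / (2.0 * (c - b))         # Pos = (c-r)/(c-b), Nec = 0
    return 0.0

def var_triangular(beta, a, b, c):
    """VaR_beta = sup{ r | Cr{L >= r} >= beta } in closed form for 0 < beta <= 0.5."""
    return c - 2.0 * beta * (c - b)

a, b, c = 650.0, 700.0, 750.0                    # hour-1 demand from Table 1
lam1 = var_triangular(0.1, a, b, c)
print(lam1, credibility_ge_triangular(lam1, a, b, c))   # -> 740.0 MW at credibility 0.1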
Then, we solve the problem by the proposed IPSO. After 200 iterations, under the condition that all the constraints are satisfied and the system reliability is not less than 95%, the least cost obtained is 545324.4 $.
7 Conclusions In a fuzzy environment, we proposed a Value-at-Risk based power system reliability model. This model can minimize the system cost while considering various constraints and the reliability of the system. Furthermore, we proposed an improved PSO as the solution. The model has been shown to be useful and effective by the numerical example, and it can be applied to various engineering reliability control problems. However, this research did not consider the unit commitment problem. In the future, we will improve this model to solve more complicated and complete systems.
Acknowledgements This research was supported by Waseda University Global COE Program “International Research and Education Center for Ambient SOC" sponsored by MEXT, Japan.
References [1] Pathom, A., Hiroyuki, K., Eiichi, T.: A hybrid LR-EP for solving new profit-based UC problem under competitive environment. IEEE Trans. on Power System 18, 229–237 (2003) [2] Yuan, X., Nie, H., Su, A., Wang, L., Yuan, Y.: An improved binary particle swarm optimization for unit commitment problem. Expert Systems with Applications 36(4), 8049–8055 (2009) [3] Ruiz, P.A., Philbrick, C.R., Cheung, K.W., Sauer, P.W.: Uncertainty management in the unit commitment problem. IEEE Trans. on Power System 24(2), 642–651 (2009) [4] Rahman, H.A., Shahidehpour, S.M.: A fuzzy based optimal reactive power control. IEEE Trans. on Power System 8(2), 662–670 (1993) [5] Khare, R., Christie, R.D.: Prioritizing emergency control problem with fuzzy set theory. IEEE Trans. on Power System 12(3), 1237–1242 (1997) [6] Abou El-Ela, A.A., Bishr, M.A., Allam, S.M., Ei-Sehiemy, R.A.: An emergency power system control based on the multi-stage fuzzy based procedure. Electric Power Systems Research 77(5-6), 421–429 (2007) [7] Liu, B., Liu, Y.K.: Expected value of fuzzy variable and fuzzy expected value models. IEEE Transaction on Fuzzy Systems 10(4), 445–450 (2002) [8] Nahmias, S.: Fuzzy variable. Fuzzy Sets and Systems 1(2), 97–101 (1978) [9] Wang, S., Liu, Y., Dai, X.: On the continuity and absolute continuity of credibility functions. Journal of Uncertain System 1(2), 185–200 (2007) [10] Duffie, D., Pan, J.: An overview of value-at-risk. Journal of Derivatives 4(3), 7–49 (1997) [11] Wang, S., Watada, J., Pedrycz, W.: Value-at-Risk-Based two-satge fuzzy facility location problems. IEEE Transactions on Industrial Informatics 5(4), 465–482 (2009)
[12] Kennedy, J., Eberhart, R.C.: Particle swarm optimization. In: Proceedings of the 1995 IEEE International Conference on Neural Networks, IV, pp. 1942–1948 (1995) [13] Sharma, K.D., Chatterjee, A., Rakshit, A.: A Hybrid Approach for Design of Stable Adaptive Fuzzy Controllers Employing Lyapunov Theory and Particle Swarm Optimization. IEEE Transactions on Fuzzy Systems 17(2), 329–342 (2009) [14] Zhao, L., Yang, Y.: PSO-based single multiplicative neuron model for time series prediction. Expert Systems with Applications 36(2), 2805–2812 (2009)
Human Tracking: A State-of-Art Survey Junzo Watada1, Zalili Musa2, Lakhmi C. Jain3, and John Fulcher4 1
Waseda University, Japan [email protected] 2 University Malaysia Pahang, Malaysia [email protected] 3 University of South Australia, Australia [email protected] 4 University of Wollongong, Australia [email protected]
Abstract. Video tracking can be defined as the action of estimating the trajectory of an object in the image plane as it moves within a scene. A tracker assigns consistent labels to the tracked objects in different frames of a video. The objective of this paper is to provide information on the present state of the art and to discuss future trends in the use of multi-camera tracking systems. In the literature, three main types of work on multi-camera tracking systems have been outlined. The first concerns the challenges in camera tracking systems. The second concerns the methodology of tracking systems in general. The third concerns current trends in camera tracking systems. We provide an overview of the current research status by summarizing promising avenues for further research. Keywords: Video tracking, Human tracking, Object tracking.
1 Introduction The ability to consistently detect and track object motion, such as vehicles and humans, is a very useful tool for higher-level applications that rely on visual input. In the early 1980s, Video Motion Detection (VMD) was developed to enable image processing in security technology. Ever since then, there have been a huge number of developments affecting the methods, algorithms and techniques being implemented in various fields such as security, monitoring, robotics and surveillance systems. During the early stages of tracking system development, many researchers focused on tracking an object's movement in an indoor environment. Nevertheless, society demands tracking systems for a wider range of activities in outdoor environments. Thus, most of the new research on tracking systems has focused on exploring larger dimensions for tracking object motion. Generally, in larger environments, it is not easy to track vehicles or human behavior over many cameras, each of which covers a limited portion of space in a vast universe [1]. Therefore, a multi-camera approach is an alternative to overcome the large view problem. R. Setchi et al. (Eds.): KES 2010, Part II, LNAI 6277, pp. 454–463, 2010. © Springer-Verlag Berlin Heidelberg 2010
2 Tracking Tracking an object's movement is a very challenging task since the motions of the target change dynamically. In the conventional method, an object's motion can be measured by two models: 1) stochastic models with deterministic and random components; and 2) stochastic models using classical deterministic mechanics [2]. Furthermore, when tracking an object's motion based on the constant acceleration model, the size of the object will be affected during movement in the camera view. Various techniques and methods have been developed in order to overcome these problems. Some criteria for the tracking features must be taken into consideration in order to decide the best technique for object tracking. 2.1 Tracking by Detection Tracking by detection refers to continuous detection to localize regions, points or features of an image frame by frame. This technique commonly uses classification schemes or invariant descriptors and focuses more on real-time applications. Currently, different detection methods have been proposed for tracking by detection. Based on Collins et al., methods of moving target detection can be classified into three conventional approaches: 1) temporal differencing, 2) background subtraction and 3) optical flow [3]. On the other hand, Yilmaz et al. stressed that there are four categories of detection method that have been used in tracking systems, which are 1) point detector, 2) segmentation, 3) supervised learning and 4) background subtraction [4].
a. Point detector
The point detector method is employed to find significant points (interest points) in images, points which have an expressive feature in their respective localities. A point process can be specified either in terms of counts of points in a set or in terms of its interval sequence. In computer vision, interest points have been widely used in solving various problems such as object recognition, video or image retrieval, image registration and tracking. In general, interest point detectors can be divided into three categories: contour based, intensity based and parametric model based. The contour-based method is frequently used to detect lines, junctions, endings, shapes and edges of objects, such as the outer contours of lips, eyes, eyebrows, buildings and so on. For example, contour-based moving object detection and tracking has been proposed by Yokoyama and Poggio [5]. The purpose of this method is to detect and track moving and non-rigid objects using gradient-based optical flow and edge-based features. The results show that the method is able to handle occlusion and interference problems. Cao et al. proposed contour detection based on spatial-temporal (S-T) characteristics. This method applies S-T features to detect the contour of the image, and the results show that the contour of a moving object can be clearly distinguished from its background [6]. It has also been shown that the method is sensitive to object movement and can cope with the effects of a complex background. In addition, Wang et al. applied the B-Snake lane model with the CHEVP (Canny/Hough estimation of vanishing points) algorithm to find the possible lines
in the road boundaries in the image plane. In that study, Hough transforms were used to detect the horizon and the vanishing points of lines on a highway [7]. The intensity-based method is another method that has been employed in point detection. Nowadays, this method is applied in various tracking applications such as surveillance, monitoring and so on. In this method, the single Gaussian, mixture of Gaussians, histogram, texture information and gray-level pixel intensity are the basic techniques usually employed by researchers. However, this method has difficulty when illumination changes influence the intensity variations in the images. In order to overcome this problem, the intensity-based method has been combined with other methods. According to Yokoi, change detection using intensity and color information is a robust way to detect human movement even when the images have low illumination and color differences [8]. Haritaoglu et al. have proposed intensity-based and stereo-based detection in a real-time system for detecting and tracking people. The results show that both methods are capable of successfully distinguishing the foreground and background of an image. The main advantage of these methods is their ability to detect shapes (silhouettes) in foreground regions more accurately, as well as their capability to avoid the illumination problem [9]. Ghiasi et al. used the KLT (Kanade-Lucas-Tomasi) tracking method to detect human movement in a video scene. In their work, a number of traceable points in the image are selected based on large contrasts of intensity [10]; this process is called feature selection. In the next stage, the KLT tracking method locks onto the selected features and strives to locate them in the next frame. Typically, parametric detection is employed in situations where the densities are known; the information on the density can be formulated to act as a decision rule. However, in motion tracking, most researchers employ non-parametric detection to detect the location of the moving object, since the image features in motion tracking change dramatically in every frame. Gibson and Melsa have suggested that non-parametric detection theory is appropriate for situations that do not have a reasonable nominal density, or where the possible extent of deviations from the nominal density cannot be assessed [11]. Elgammal et al. have proposed a non-parametric background model for tracking systems in order to handle non-static backgrounds that contain small motions such as tree branches and bushes. This model estimates the probability of observing particular pixel intensity values based on a sample of intensity values for each pixel. The results show that this model achieves sensitive detection with low false alarm rates [12].
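To make the KLT-style feature selection concrete, the sketch below scores pixels by the smaller eigenvalue of the local gradient structure tensor (the Shi-Tomasi criterion that underlies KLT feature selection) and keeps the strongest responses as trackable points. It is a minimal Python/NumPy illustration, not the implementation used in [10]; the window size and corner count are arbitrary choices.

```python
import numpy as np
from scipy.signal import convolve2d

def select_trackable_points(gray, max_points=100, window=3):
    """KLT-style feature selection: score each pixel by the smaller
    eigenvalue of the local gradient structure tensor (Shi-Tomasi)."""
    Iy, Ix = np.gradient(gray.astype(float))          # image gradients
    k = np.ones((window, window))                     # box window
    Sxx = convolve2d(Ix * Ix, k, mode="same")
    Syy = convolve2d(Iy * Iy, k, mode="same")
    Sxy = convolve2d(Ix * Iy, k, mode="same")
    # smaller eigenvalue of the 2x2 structure tensor [[Sxx, Sxy], [Sxy, Syy]]
    trace, det = Sxx + Syy, Sxx * Syy - Sxy * Sxy
    lam_min = trace / 2.0 - np.sqrt(np.maximum(trace ** 2 / 4.0 - det, 0.0))
    # keep the positions with the largest "trackability" score
    idx = np.argsort(lam_min.ravel())[::-1][:max_points]
    ys, xs = np.unravel_index(idx, gray.shape)
    return list(zip(xs.tolist(), ys.tolist()))        # (x, y) interest points
```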
b. Segmentation
Object detection can be performed by achieving an efficient partition of an image based on its features, for example color, shape or pattern. This partitioning method is called segmentation. An image segmentation algorithm classifies an image according to common criteria such as color, shape, etc. Each segmentation algorithm addresses two problems: firstly, the criteria for a good partition, and secondly, the method for achieving an efficient partitioning. For robust real-time upper-body limb detection and tracking, Siddiqui and Medioni designed a real-time and robust system to detect and track human forearms as skin-colored regions. In their research, they used the continuous distance transform
technique to measure skin color and edge information as visual features in order to detect and track the forearms of a human [13]. In addition, the mean-shift algorithm has achieved considerable success in object tracking due to its simplicity and robustness. Mean-shift is a general non-parametric mode-finding/clustering procedure. It was first proposed by Fukunaga and Hostetler [14], later adapted by Cheng [15] for the purpose of image analysis, and more recently extended by Comaniciu and Meer to low-level vision problems including segmentation, adaptive smoothing and tracking [16]. In addition to the methods mentioned above, the histogram is another commonly applied technique in segmentation. A histogram is an approach to presenting statistical information that uses a visual graph to show the frequency of a range of variables: it is a bar chart representing a frequency distribution, in which the heights of the bars represent the observed frequencies. Color histograms and image histograms are the two types of histogram typically used in computer vision. A color histogram represents the distribution of colors in an image by counting the number of pixels in each of a given set of color ranges, for example in a two-dimensional or three-dimensional color space. For example, Healey [17] proposed a color segmentation algorithm based on the physical properties of the spectrum reflected from various materials, assuming negligible sensor noise. Another example can be seen in Huang et al. [18], who used a combination of a scale-space filter and the projection of the three-dimensional (3-D) color histogram onto one-dimensional (1-D) histograms to perform color segmentation. Meanwhile, an image histogram is a graphical representation of the tonal distribution in a digital image. Normally, an image histogram of scalar pixel values is preferred in image processing over a color histogram. Based on [19], the easiest way to determine a threshold for binarizing an image is to find the two peaks of the image histogram, separated by a valley corresponding to the intermediate gray levels; the valley is then used to divide the image into object and background.
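The two-peak/valley idea in [19] can be automated with Otsu's between-class variance criterion, a standard way of choosing the gray level that best separates a bimodal histogram. The sketch below is a generic Python/NumPy illustration of that idea, not code from the cited work.

```python
import numpy as np

def bimodal_threshold(gray):
    """Pick the gray level that best separates the two histogram peaks
    (Otsu's between-class variance criterion) and binarize the image."""
    hist, _ = np.histogram(gray.ravel(), bins=256, range=(0, 256))
    p = hist / hist.sum()                      # gray-level probabilities
    omega = np.cumsum(p)                       # weight of the "dark" class per cut
    mu = np.cumsum(p * np.arange(256))         # cumulative mean
    mu_total = mu[-1]                          # global mean gray level
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / (omega * (1.0 - omega))
    t = int(np.nanargmax(sigma_b))             # threshold at the histogram valley
    return t, gray > t                         # threshold value and binary mask
```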
c. Supervised learning
Supervised learning is a technique employed in data training. It requires a large collection of samples for its learning process. During the learning process, a set of input data is used to train a model that generates either a continuous output value (regression) or the class label of the input object (classification). In supervised learning, however, the selection of features plays a significant role in classifying the data. Therefore, in certain cases it is necessary to use a particular set of features to ensure that the classes resulting from the learning process can be separated from one another. Two techniques commonly employed in supervised learning are temporal difference learning and template matching. Temporal difference learning is a prediction method mainly used in solving reinforcement learning problems. This method can learn directly from the agent's raw experience in pursuit of a goal: a prediction is made whenever an observation becomes available, and the prediction is then adjusted to better match the observation. Ho et al. used the temporal difference approach with an Actor-Critic algorithm to derive an update model for human movement. This is an on-policy learning method
that produces a critique, in the form of a temporal-difference error, after each action taken by the actor [20]. Additionally, in moving target classification and tracking from real-time video, Lipton et al. used a combination of temporal differencing and template matching to classify tracked targets into three categories: humans, vehicles or background clutter [21]. Nowadays, human tracking systems are widely used for outdoor activities. Pai et al. focused their research on pedestrian detection and tracking at crossroads, using a pedestrian model and a walking-rhythm approach to recognize a single pedestrian [22]. Meanwhile, Qui et al. have proposed corner feature extraction, motion matching and object classification for the purpose of detecting pedestrians and bicycles on the road [23].
d. Background subtraction
Background subtraction is a very popular conventional method, and various researchers have employed it for detection purposes. It is widely used in tracking systems and especially in surveillance applications. The name "background subtraction" comes from the simple technique of subtracting a model of the scene background from each image and thresholding the difference to obtain the objects of interest. Based on [4], [9] and [12], this method is typically used to segment moving regions in image sequences taken from a static camera: changed or moving regions are detected simply by taking the absolute difference between each new frame and a model of the scene background. Numerous background subtraction techniques have been proposed by researchers, for example the adaptive Gaussian mixture model. In this technique, the values of a particular pixel are modeled as a mixture of Gaussians. Based on the persistence and the variance of each Gaussian in the mixture, the background and foreground are determined; pixel values that do not fit the background distributions are considered foreground. Dickinson and Hunter used an adaptive Gaussian mixture model in color and space to represent the background and foreground regions of the scene [24]. This model is used to probabilistically classify observed pixel values, incorporating the global scene structure into pixel-level segmentation. In a real-time indoor and outdoor tracker, Stauffer and Grimson employed an adaptive Gaussian mixture model in order to evaluate and determine the background of an image [25]. Each pixel is classified based on whether the Gaussian distribution that represents it is considered part of the background model. The results show that this technique is reliable in dealing with lighting changes, repetitive motion from clutter and long-term scene changes. On the other hand, the local dependency histogram (LDH) is another background subtraction technique. It has been proposed by Weng et al. to solve problems commonly faced by traditional methods for dynamic background subtraction, such as waving trees, spouting fountains, illumination changes, camera jitter, etc. [26]. In this technique, each pixel is modeled as a group of weighted LDHs. The pixel is then labeled as foreground or background by comparing the new LDH computed in the current frame against its LDH model. Finally, a fixed threshold value is used to decide whether or not a pixel matches its model.
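As a concrete illustration of the frame-to-background comparison described above, the following Python sketch maintains a per-pixel single-Gaussian background model, a simplified relative of the adaptive mixture models in [24] and [25]. The learning rate and deviation threshold are illustrative values, not parameters taken from the cited works.

```python
import numpy as np

class RunningGaussianBackground:
    """Per-pixel single-Gaussian background model: a pixel is foreground
    when it deviates from the running mean by more than k standard deviations."""

    def __init__(self, first_frame, alpha=0.02, k=2.5):
        f = first_frame.astype(float)
        self.mean = f
        self.var = np.full_like(f, 15.0 ** 2)   # illustrative initial variance
        self.alpha, self.k = alpha, k

    def apply(self, frame):
        f = frame.astype(float)
        d = f - self.mean
        foreground = d ** 2 > (self.k ** 2) * self.var
        # update the mean/variance only where the pixel looks like background
        a = np.where(foreground, 0.0, self.alpha)
        self.mean += a * d
        self.var = (1.0 - a) * self.var + a * d ** 2
        return foreground          # boolean foreground mask for this frame
```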
Kernel density estimation is another prominent technique in background subtraction. It is applied when a nonparametric density model is required in a tracking system. Generally, kernel density estimation is a nonparametric technique for density estimation in which a known density function (the kernel) is averaged across the observed data points to create a smooth approximation. In [16], kernel density estimation is applied to handle situations where the scene background is not completely static but contains small motions, such as moving tree branches and bushes, or illumination changes; it is also able to suppress false detections that arise due to small camera displacements. Additionally, Li et al. have proposed a kernel density estimation based on gradient features to suppress fake objects caused by dynamic cast shadows and reflections [27]. Mao and Shi have proposed a diversity-sampling-based kernel density estimation technique for nonparametric background modeling in cluttered environments with slow motion [28]. The proposed model is able to deal with multi-modal backgrounds and is robust to scene changes. 2.2 Statistical Tracking The statistical tracking method does not require the system to execute the detection process on every frame. It is mostly based on motion prediction or filter updates. The main advantage of this method is that object tracking is faster than with continuous detection. Nowadays, several statistical techniques are employed in the tracking process, such as the Kalman filter, the particle filter, the Parzen filter, etc. The statistical method is carried out in two steps. Firstly, tracking is performed by predicting the object's position from previous information and verifying the existence of the object at the predicted position. Secondly, the observation likelihood function and motion model must be learned from sample image sequences before tracking is performed. The statistical method is currently employed in many tracking applications. The Kalman filter technique is often used to track object motion. In 1960, R.E. Kalman published his famous paper describing a recursive solution to the discrete-data linear filtering problem [29]. Ever since, the Kalman filter has been the subject of extensive research and application, particularly in a wide range of engineering applications from radar to computer vision. The Kalman filter is a set of mathematical equations that provides an efficient computational (recursive) means to estimate the state of a process: it supports estimation of past, present and even future states, and it can do so even when the precise nature of the modeled system is unknown [29]. In video tracking applications, the Kalman filter is utilized to estimate the state of an object's motion in a dynamic system. For example, Weng et al. have proposed an adaptive Kalman filter to track moving objects in HSI color space [30]. Their method robustly tracks the moving object across consecutive frames under complex real-world conditions such as total or partial occlusion by other objects, fast motion, changing lighting, changes in the direction and orientation of the moving object, and sudden changes in its velocity. On top of that, Cuevas et al. have proposed extensions to the Kalman filter technique to overcome small occlusion problems as well as to model the complex movement
of an object [31], [32]. In their conclusions, they emphasized that the Kalman filter algorithm performs poorly when tracking an object with a complex trajectory; consequently, to solve this problem, they employed an extended Kalman filter. Tracking the motion of multiple objects typically takes place in non-linear or non-Gaussian environments. Therefore, in order to deal with this, most researchers use the probability density function (pdf). As an example, Weiler et al. used optical flow estimation based on pdfs calculated at each image position [33]. Meanwhile, Wang et al. have proposed a Dynamic Bayesian Network (DBN) model based on probability distributions in order to track an object in non-linear, non-Gaussian and multimodal situations [34]. Tian et al. have proposed a new object tracking algorithm based on a combined HSV color probability histogram model in their Cam-shift algorithm [35]. A probability distribution describes the range of possible values that a random variable can attain and the probability that the value of the random variable lies within any (measurable) subset of that range, while the probability density function (pdf) describes the density of probability at each point in the sample space. This approach is frequently used for tracking an object against a cluttered background. The condensation method is a popular pdf-based technique. In the condensation method, object tracking is performed by predicting the object's position from previous information and verifying the existence of the object at the predicted position. This method is also known as the particle filter. In principle, the particle filter represents the prior and posterior distributions of a given parameter as a set of samples (particles) together with a weight for each particle. In visual tracking, a particle filter uses multiple discrete "particles" to represent the belief distribution over the location of a tracked object. Starting from Blake and Isard's condensation algorithm [36], the particle filter has become a very popular method for visual tracking in computer vision applications, such as security and monitoring systems used to track the human body, faces, hand gestures, vehicles on a highway and so on. Based on [37] and [38], the particle filter has been shown to provide improvements in performance over some conventional methods (such as the Kalman filter), especially in non-linear or non-Gaussian environments; the improvement here refers to the ability to track moving objects robustly in a cluttered environment. In addition, a range of prediction methods have been proposed by researchers. Yeoh and Abu Bakar proposed a tracking algorithm based on Linear Prediction (LP) solved by the Maximum Entropy Method (MEM) [39]; the method attempts to predict the centroid of the moving object in the next frame from several past centroid measurements. Xiao et al. have developed a novel location prediction model based on grey theory [40]; the proposed model adopts the grey modeling method in order to predict the future location of uncertain moving objects. Meanwhile, How and Bin employed a Fuzzy Prediction System (FPS) to predict the position of the moving target in the next image frame [41]. Currently, many researchers combine statistical methods with continuous detection in order to enhance tracking techniques and algorithms. The
purpose of combining several techniques is to solve the various problems that arise in tracking applications, such as occlusion and simultaneous changes of the camera and the object.
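To make the Kalman-filter-based statistical tracking described in Section 2.2 concrete, the sketch below implements the standard predict/update recursion for a constant-velocity state [x, y, vx, vy] observed through noisy (x, y) centroids. It is a generic Python/NumPy illustration of the filter equations, not the adaptive or extended variants of [30], [31] and [32]; the noise covariances are illustrative values.

```python
import numpy as np

def constant_velocity_kalman(measurements, dt=1.0, q=1e-2, r=4.0):
    """Minimal Kalman tracker: state [x, y, vx, vy], constant-velocity model,
    measurements are (x, y) object centroids from a detector."""
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], float)          # state transition
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], float)          # we observe position only
    Q, R = q * np.eye(4), r * np.eye(2)          # process / measurement noise
    x = np.array([*measurements[0], 0.0, 0.0])   # initial state from first centroid
    P = np.eye(4) * 10.0                         # initial uncertainty
    track = []
    for z in measurements[1:]:
        # predict
        x = F @ x
        P = F @ P @ F.T + Q
        # update with the new centroid measurement
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (np.asarray(z, float) - H @ x)
        P = (np.eye(4) - K @ H) @ P
        track.append(x[:2].copy())
    return np.array(track)                       # filtered positions
```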
3 Concluding Remarks This paper has provided an overview of recent research on human tracking. In surveying the present state of the art, we have also identified current trends and promising directions for future work in the area.
References [1] Musa, Z.B., Watada, J.: A Grid-Computing Based Multi-Camera Tracking System for Vehicle Plate Recognition. Kybernetika Journal, Czech Republic 42(4), 495–514 (2006) [2] Meyer, F., Bouthemy, P.: Region-based tracking in an image sequence. In: Sandini, G. (ed.) ECCV 1992. LNCS, vol. 588, pp. 476–484. Springer, Heidelberg (1992) [3] Collins, R.T., Lipton, A.J., Kanade, T., Fujiyoshi, H., Duggins, D., Tsin, Y., Tolliver, D., Enomoto, N., Hasegawa, O.: System for Video Surveillance and Monitoring, Technical report CMU-RI-TR-00-12, Robotics Institute, Carnegie Mellon University (2000) [4] Yilmaz, A., Javed, O., Shah, M.: Object tracking: A survey. ACM Computing Surveys (CSUR) 38(13), 1–45 (2006) [5] Yokoyama, M., Poggio, T.: A Contour-Based Moving Object Detection and Tracking. In: IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, pp. 271–276 (2005) [6] Cao, Y.-y., Xu, G.-y., Riegel, T.: Moving Object Contour Detection Based on S-T Characteristics in Surveillance, Human Interface and the Management of Information. In: Methods, Techniques and Tools in Information Design, pp. 575–583 (2007) [7] Wang, Y., Teoh, E.K., Shen, D.: Lane Detection Using B-Snake. In: International Conference on Information, Intelligence, and Systems, p. 438 (1999) [8] Yokoi, K.: Probabilistic BPRRC: Robust Change Detection against Illumination Changes and Background Movements. In: Conference on Machine Vision Applications, MVA 2009, pp. 148–151 (2009) [9] Haritaoglu, I., Harwood, D., Davis, L.S.: W4: real-time surveillance of people and their activities. In: Pattern Analysis and Machine Intelligence, pp. 809–830 (2000) [10] Ghiasi, S., Moon, H.J., Nahapetian, A., Sarrafzadeh, M.: Collaborative and Reconfigurable Object Tracking. The Journal of Supercomputing 30(3), 213–238 (2004) [11] Gibson, J.D., Melsa, J.L.: Introduction to Non-Parametric Detection with Applications. Academic Press, New York (1975) [12] Elgammal, A.M., Harwood, D., Davis, L.S.: Non-parametric Model for Background Subtraction. In: Vernon, D. (ed.) ECCV 2000. LNCS, vol. 1843, pp. 751–767. Springer, Heidelberg (2000) [13] Siddiqui, M., Medioni, G.: Real Time Limb Tracking with Adaptive Model Selection. In: International Conference on Pattern Recognition (ICPR 2006), vol. 4, pp. 770–773 (2006) [14] Fukunaga, K., Hostetler, L.: The estimation of the gradient of a density function, with applications in pattern recognition. IEEE Transactions on Information Theory 21(1), 32–40 (1975) [15] Cheng, Y.: Mean shift, mode seeking, and clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence 17(8), 790–799 (1995)
[16] Comaniciu, D., Meer, P.: Mean shift: A robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(5), 603–619 (2002) [17] Healey, G.: Segmenting images using normalized color. IEEE Trans. Syst., Man, Cybern. 22, 64–73 (1992) [18] Huang, C.L., Cheng, T.Y., Chen, C.C.: Color image segmentation using scale space filter and Markov random fields. Pattern Recognit. 25, 1217–1229 (1992) [19] Huang, Z.-K., Xie, Y.-M., Liu, D.-H., Hou, L.-Y.: Using Fuzzy C-means Cluster for Histogram-Based Color Image Segmentation. In: International Conference on Information Technology and Computer Science, pp. 597–600 (2009) [20] Ho, M.A.T., Yamada, Y., Umetani, Y.: An HMM-based temporal difference learning with model-updating capability for visual tracking of human communicational behaviors. In: Proceedings of Fifth IEEE International Conference on Automatic Face and Gesture Recognition, pp. 163–168 (2002) [21] Lipton, A.J., Fujiyoshi, H., Patil, R.S.: Moving target classification and tracking from real-time video. In: Proc. IEEE Workshop Applications of Computer Vision, pp. 8–14 (1998) [22] Pai, C.-J., Tyan, H.-R., Liang, Y.-M., Liao, H.-Y.M., Chen, S.-W.: Pedestrian detection and tracking at crossroads. In: International Conference on Image Processing (ICIP 2003), vol. 3, pp. II-101-4 (2003) [23] Qui, Z., Yao, D., Zhang, Y., Ma, D., Liu, X.: The study of the detection of pedestrian and bicycle using image processing. Intelligent Transportation Systems 1, 340–345 (2003) [24] Dickinson, P., Hunter, A.: Scene modeling using an adaptive mixture of Gaussians in color and space. In: IEEE Conference on Advanced Video and Signal Based Surveillance, pp. 64–69 (2005) [25] Stauffer, C., Grimson, W.: Adaptive Background Mixture Models for Real-Time Tracking. In: Proc. of IEEE Conference on Computer Vision and Pattern Recognition, pp. 246– 252 (1999) [26] Weng, J., Zhang, Y., Hwang, W.S.: Candid covariance-free incremental principal component analysis. PAMI 25(8) (2003) [27] Li, Y.: Perceptually inspired image estimation and enhancement, Thesis (Ph. D.) Massachusetts Institute of Technology, Dept. of Brain and Cognitive Sciences (2009) [28] Mao, Y.-f., Shi, P.-f.: Diversity sampling based kernel density estimation for background modeling. Journal of Shanghai University (English Edition) 9(6), 506–509 (2005) [29] Kalman, R.E.: A New Approach to Linear Filtering and Prediction Problems. Transactions of the ASME – Journal of Basic Engineering 82(Series D), 35–45 (1960) [30] Weng, S., Kuo, C., Tu, S.: Video object tracking using adaptive Kalman filter. Journal of Visual Communication and Image Representation 17(6), 1190–1208 (2006) [31] Cuevas, E.V., Zaldívar, D., Rojas, R.: Kalman Filter for Vision Tracking, Technical Report B-05-12, Freie Universität Berlin, Fachbereich Mathematik und Informatik (2005) [32] Cuevas, E.V., Zaldívar, D., Rojas, R.: Particle Filter for Vision Tracking, Technical Report B-05-13, Freie Universität Berlin, Fachbereich Mathematik und Informatik (2005) [33] Weiler, D., Willert, V., Eggert, J.: A Probabilistic Prediction Method for Object Contour Tracking. In: Kůrková, V., Neruda, R., Koutník, J. (eds.) ICANN 2008, Part I. LNCS, vol. 5163, pp. 1011–1020. Springer, Heidelberg (2008) [34] Wang, T., Diao, Q., Zhang, Y., Song, G., Lai, C., Bradski, G.: A Dynamic Bayesian Network Approach to Multi-cue based Visual Tracking. In: Proceedings of the Pattern Recognition, 17th International Conference on (Icpr 2004), vol. 2, pp. 167–170 (2004)
[35] Tian, G., Hu, R., Wang, Z., Fu, Y.: Improved Object Tracking Algorithm Based on New HSV Color Probability Model. In: Proceedings of the 6th international Symposium on Neural Networks: Advances in Neural Networks - Part II, pp. 1145–1151 (2009) [36] Isard, M., Blake, A.: Contour tracking by stochastic propagation of conditional density. In: Proc. of the 4th European Conference on Computer Vision, vol. 1, pp. 343–356 (1996) [37] Xu, X., Li, B.: Rao-Blackwellised particle filter for tracking with application in visual surveillance. In: IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, pp. 17–24 (2005) [38] Ye, Z., Liu, Z.-Q.: Tracking Human Hand Motion Using Genetic Particle Filter. In: IEEE International Conference on Systems, Man and Cybernetics (SMC 2006), vol. 6, pp. 4942–4947 (Okt. 2006) [39] Yeoh, P.Y., Abu-Bakar, S.A.R.: Maximum entropy method (MEM) for accurate motion tracking. In: Conference on Convergent Technologies for Asia-Pacific Region, TENCON 2003, vol. 1, pp. 345–349 (2003) [40] Xiao, Y., Zhang, H., Wang, H.: Location Prediction for Tracking Moving Objects Based on Grey Theory. In: Proceedings of the Fourth International Conference on Fuzzy Systems and Knowledge Discovery, vol. 2, pp. 390–394 (2007) [41] How, K.B., Bin, A.S.K.: Implementing Fuzzy Logic in Planar Motion Tracking. Journal of Artificial Intelligence and Machine Learning 7, 1–10 (2006)
Ordinal Structure Fuzzy Logic Predictor for Consumer Behaviour Rubiyah Yusof, Marzuki Khalid, and Mohd. Ridzuan Yunus Universiti Teknologi Malaysia Center for Artificial Intelligence and Robotics (CAIRO), Jalan Semarak, 54100 Kuala Lumpur [email protected], [email protected]
Abstract. In this paper, an intelligent predictor model is developed for forecasting which products buyers would be interested to purchase, using the best customer and biggest customer indexes as inputs to the predictor. The model is based on fuzzy logic and Ordinal Structure Fuzzy Logic (OSFL). These intelligent models are developed from the market survey data of the U.S. Bureau of Labour Statistics (BLS). The market survey data are sets of consumers, categorized into various demographics, together with their spending on products. The association factors among the products and the consumers' spending habits are determined using indexes proposed by New Strategist in order to develop the fuzzy predictor model. Software is developed to simulate the predictor based on the demographics; it outputs a list of products ordered by probability. The simulation results are compared with the BLS data and found to be quite accurate. Keywords: Fuzzy Logic, Ordinal Structure Fuzzy Logic (OSFL), U.S. Bureau of Labour Statistics (BLS), consumer behaviour prediction.
1 Introduction The Consumer Expenditure Survey (CE) program began in 1980. Its principal objective is to collect information on the buying habits of American consumers. Consumer expenditure data are used in various types of research by government, business, labour and academic analysts. Additionally, the data are required for periodic revisions of the U.S. Bureau of Labour Statistics (BLS) Consumer Price Index (CPI) [2]. Today, research on consumer behaviour is beginning to receive attention from firms and organizations. This is because studies of consumer behaviour help them to improve their marketing strategies by understanding psychological issues such as how consumers think, feel, reason and choose between different alternatives (e.g., brands, products), and how the consumer is influenced by his or her environment (e.g., culture, family, signs, media). Understanding the behaviour or buying trends of groups of consumers gives companies an advantage in deciding which marketing strategies to employ to increase the sales of their products. Customers are very much influenced by
marketing and promotion activities for products, such as advertisements. In order to compete for customers' attention, advertisements must usually be repeated extensively. However, these activities may be too costly for certain businesses and, moreover, the promotion may be directed at the wrong groups of consumers. If consumer behaviour is understood, promotions and advertisements can be focused more closely on the needs of the customers and will thus be more effective. This paper describes the development of an intelligent predictor model for predicting which products buyers would be interested to purchase, based on the trends, behaviour and habits of buyers. The models are based on fuzzy logic, proposed by Zadeh in 1965 [3], and ordinal structure fuzzy logic (OSFL), first proposed by Ohnishi et al. [4]. OSFL adds dynamic corrections to the reasoning by using the degree of priority among control rules; it is an effective method for simplifying the partitioning of input spaces and decreasing the number of control rules. The rules are single-dimensional in nature, each with an associated weight that determines the priority with which it fires, and the model can be used for multi-input, multi-output systems. Several applications of OSFL can be found in [5], [6]. The intelligent model is developed based on the market survey of consumer spending in the U.S. Bureau of Labor Statistics (BLS) data. The market survey consists of various data on the demographics of consumer spending on products. The relationship between the demographic categories and the products is determined using three main indexes: (i) Average Spending, (ii) Indexed Spending (Best Customers) and (iii) Market Share (Biggest Customers). Indexes (ii) and (iii) are inputs to the fuzzy logic predictor, which predicts the most likely products to be purchased given a certain combination of the demographic categories of the consumers.
2 System Design In this paper, there are two main parts involved, which are:
• Market survey data analysis (BLS data)
• Development of the Fuzzy Logic Predictor
2.1 Market Survey Data Analysis The market survey consists of various data on consumer spending on products. The main challenge is to analyze the association factors between the products and consumers' spending habits in order to develop the fuzzy predictor model. In doing so, we have considered three main factors that determine the relationship between the demographic categories and the product categories, referring to the factors proposed by New Strategist [1]. These factors are: (i) Average spending. Average spending is the ratio of the total spending of the households in all segments to the total number of households in all segments. Average spending
figures are useful for determining the market potential of a product or service in a local area. Average spending can be determined using:

$S_{ave} = \dfrac{\sum_{i=1}^{n} S_T(i)}{\sum_{i=1}^{n} N(i)}$   (1)
where $S_{ave}$ is the average spending of all households, $S_T(i)$ is the total spending of the households in segment $i$, $N(i)$ is the total number of households in segment $i$, and $n$ is the total number of segments. For each segment $i$, the average spending is given by:

$S_{ave}(i) = \dfrac{S_T(i)}{N(i)}$   (2)
A segment is referred to as one demographic of the household. In this paper the number of segments is six, and these are: age, income, marital status/children, race, education and region.
(ii) Indexed spending (Best Customers). The indexed spending figures compare the spending of each demographic segment with that of the average household. To compute the indexes, the average amount each household segment spends on a particular item/product is divided by how much the average household spends on the item, and the value is then multiplied by 100. An index of 100 is the average for all households. An index of 125 means average spending by households in the segment is 25 percent above average (100 plus 25). An index of 75 means average spending by households in the segment is 25 percent below average (100 minus 25). The indexed spending can be described as follows: Best Customer = (average amount each household segment spends on the item) / (amount the average household spends on the item) × 100, that is,

$\text{Best Customer}(i) = \dfrac{S_{ave}(i)}{S_{ave}} \times 100$   (3)

(iii) Market share (Biggest Customers).
The Biggest Customers index indicates the spending power of each demographic segment for each item and allows a comparison of which household segment is the biggest customer of each item/product. To produce market share figures, the total amount all households spend on each item is calculated by multiplying the average household spending on the item by
the total number of households. Then, the total household spending on each item for each demographic segment is obtained by multiplying the segment's average spending on the item by the number of households in the segment. To calculate the percentage of total spending on an item that is controlled by each demographic segment, i.e., its market share, each segment's spending on the item is divided by the total household spending on the item, as shown in equation (4):

$\text{Biggest Customer}(j) = \text{Best Customer} \times \dfrac{M_j}{\sum_{j=1}^{m} M_j}$   (4)
where $M_j$ is the household spending in each segment of demographic $j$ and $m$ is the number of segments in each demographic. 2.2 Development of Fuzzy Logic Predictor Model We have developed Mamdani fuzzy logic and OSFL models [3], [4] to model the probability of a customer buying a product. The structure of our design is shown in Fig. 1. The fuzzy logic predictor model has the objective of giving information on the most likely product preference, given a certain combination of household demographics. 2.2.1 Development of Fuzzy Logic There are two inputs to each fuzzy logic engine: the best customer and biggest customer values calculated using the equations presented in Section 2.1. For each demographic there is one fuzzy logic engine, so for six demographics we have six fuzzy logic engines. Each fuzzy logic engine has one output, which is the most likely demographic segment to buy the product. Thus, for each product, we develop a different set of fuzzy logic engines.
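The following Python sketch illustrates equations (1) to (4) for a single item and a single demographic; the market share is computed directly as each segment's share of total spending on the item, as described in the text. The function name and the example numbers are illustrative assumptions.

```python
import numpy as np

def customer_indexes(total_spend, households):
    """total_spend[i]: segment i's total spending on the item, S_T(i)
    households[i]:  number of households in segment i, N(i)
    Returns the Best Customer index (eq. 3) and the Biggest Customer
    market share, in percent, for every segment."""
    total_spend = np.asarray(total_spend, float)
    households = np.asarray(households, float)
    s_ave = total_spend.sum() / households.sum()        # eq. (1): overall average
    s_ave_seg = total_spend / households                # eq. (2): per-segment average
    best = 100.0 * s_ave_seg / s_ave                    # eq. (3): Best Customer index
    biggest = 100.0 * total_spend / total_spend.sum()   # market share (Biggest Customers)
    return best, biggest

# Example with three hypothetical segments of one demographic
best, biggest = customer_indexes([120.0, 300.0, 180.0], [40, 75, 85])
```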
Fig. 1. Intelligent predictor model based on BLS data
Fig. 2. Block diagram of fuzzy logic engine
The Best Customer and Biggest Customer indexes are chosen as inputs to the fuzzy logic engine because both are useful in determining the probability of a consumer buying a particular product. Although the biggest customer index shows the largest share of spending (market share) on a particular product, it does not by itself identify the most likely customer to buy the product; the information provided by the best customer value is also a very important factor in determining the product a customer is most likely to buy. Fig. 2 shows the block diagram of the fuzzy logic engine, which consists of four main blocks:
• Fuzzifier
• Inference engine
• Knowledge base
• Defuzzifier
2.2.1.1 Fuzzification. Fuzzification is carried out by the fuzzifier block and converts the input values of the fuzzy decision system into corresponding fuzzy values. The fuzzy sets for the best customer input are "Hate", "Dislike", "Like", "Prefer" and "Love", as shown in Fig. 3 below:
Fig. 3. Fuzzy sets of best customer
For the biggest customer input, the fuzzy sets are "Very Small", "Small", "Medium", "Big" and "Very Big", as shown in Fig. 4.
Fig. 4. Fuzzy sets of biggest customer
In this way, the input variable Best Customer is divided into categories of preference for products and translated into the above five fuzzy sets. Similarly, the input variable Biggest Customer is categorized into five different sizes and translated into the above fuzzy sets. The membership value for each category ranges from 0 to 1. 2.2.1.2 Inference Engine. The function of the fuzzy inference engine is to infer the rules from knowledge of the system and the data. Based on the two input variables, a set of rules can be developed. A total of 21 rules have been developed; some examples are given as follows: Rule 1: IF (best customer value = Hate) AND (biggest customer value = Very Small) THEN (most likely buying value = Very Low). Rule 2: IF (best customer value = Hate) AND (biggest customer value = Small) THEN (most likely buying value = Very Low). Rule 3: IF (best customer value = Hate) AND (biggest customer value = Medium) THEN (most likely buying value = Very Low). Here VS = very small, S = small, M = medium, B = big, VB = very big, H = hate, D = dislike, L = like, P = prefer, L = love. The fuzzy association matrix of rules based on the two input variables is given in Fig. 5. To obtain the consequence of the fired rules, the fuzzy max-min inference technique is used.
Fig. 5. The fuzzy association matrix rule
2.2.1.3 Defuzzification. Defuzzification is the process of converting the fuzzy output variables into crisp values; in this case the centroid defuzzification technique is used. The fuzzy output variable, "Most likely to buy", is represented by the fuzzy sets shown in Fig. 6 below:
Fig. 6. Fuzzy sets of "Most likely to buy" (membership degree over the sets Very Low, Low, Medium, High and Very High)
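A minimal sketch of the engine described in Sections 2.2.1.1 to 2.2.1.3 is given below: triangular membership functions, Mamdani max-min inference and centroid defuzzification. The membership breakpoints and the handful of rules shown are illustrative stand-ins (the paper defines 21 rules and does not list exact breakpoints), so the numbers should not be read as the authors' parameters.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function with feet at a, c and peak at b."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

# Illustrative breakpoints (the paper does not list the exact values).
BEST = {"Hate": (-50, 0, 50), "Dislike": (25, 62, 100), "Like": (75, 100, 125),
        "Prefer": (100, 140, 175), "Love": (150, 200, 250)}   # index, 100 = average
BIGGEST = {"VS": (-10, 0, 10), "S": (5, 12, 20), "M": (15, 25, 35),
           "B": (30, 42, 55), "VB": (45, 60, 75)}             # market share, percent
OUT = {"VeryLow": (-0.25, 0.0, 0.25), "Low": (0.1, 0.3, 0.5),
       "Medium": (0.35, 0.5, 0.65), "High": (0.5, 0.7, 0.9),
       "VeryHigh": (0.75, 1.0, 1.25)}

# A few of the 21 rules of the association matrix: (best set, biggest set) -> output set.
RULES = [(("Hate", "VS"), "VeryLow"), (("Like", "M"), "Medium"),
         (("Prefer", "B"), "High"), (("Love", "VB"), "VeryHigh")]

def most_likely_to_buy(best_value, biggest_value):
    y = np.linspace(0.0, 1.0, 201)                 # output universe "most likely to buy"
    aggregated = np.zeros_like(y)
    for (b_set, g_set), o_set in RULES:
        # Mamdani AND: firing strength = min of the two premise memberships
        w = min(tri(best_value, *BEST[b_set]), tri(biggest_value, *BIGGEST[g_set]))
        # clip the consequent set at the firing strength and aggregate with max
        aggregated = np.maximum(aggregated, np.minimum(w, tri(y, *OUT[o_set])))
    if aggregated.sum() == 0.0:
        return 0.0
    return float((y * aggregated).sum() / aggregated.sum())   # centroid defuzzification
```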
2.2.2 Development of Ordinal Structure Fuzzy Logic The fuzzy logic engines developed above produce six outputs, one for each of the six demographic categories; each output is the most-likely-to-buy segment of that demographic. In order to relate the six outputs and obtain the "most likely buying" product (the priority of the products), we would need another fuzzy logic engine. Using a conventional Mamdani fuzzy logic engine, a very large rule base would have to be determined: for example, with three membership functions for six inputs there would be about 729 rules ($3^6 = 729$). Instead, we use another type of fuzzy logic known as the Ordinal Structure Fuzzy Logic (OSFL) model, which can handle multiple inputs and multiple outputs. It is suitable for this kind of application, since we have six inputs to the model and 729 rules would be too large a rule base to handle. The input and output structure of the OSFL model is shown in Fig. 1 above. The difference between conventional fuzzy logic and OSFL lies mainly in the calculation of the inferred values. The conventional fuzzy inference rules are described as follows:

$R_i$: If $x_1$ is $A_{i1}$ and $x_2$ is $A_{i2}$ then $y_i$ is $B_i$, $(i = 1, 2, \ldots, n)$   (5)

Using the moment method [5], the inferred value is obtained as follows:

$y = \dfrac{\sum_{i=1}^{n} \mu_i c_i S_i}{\sum_{i=1}^{n} \mu_i S_i}$   (6)

where $R_i$ is the $i$-th fuzzy rule, $A_{i1}$, $A_{i2}$ and $B_i$ are fuzzy variables, $y_i$ is the inferred value, $\mu_i$ is the truth value of the premise of $R_i$, and $c_i$ and $S_i$ are the central position and the area of the membership function of the fuzzy variable $B_i$, respectively.
For an n-input, one-output system, the ordinal structure model, simplified from the original model proposed by Ohnishi, is described as follows:

$R_i$: If $x_1$ is $A_{i1}$ then $y_i$ is $B_i$;  $R_j$: If $x_2$ is $A_{j2}$ then $y_j$ is $B_j$, $(i, j = 1, 2, \ldots, n)$   (7)

Using the moment method,

$y = \dfrac{\sum_{i=1}^{n} w_i \mu_i c_i S_i + \sum_{j=1}^{n} w_j \mu_j c_j S_j}{\sum_{i=1}^{n} w_i \mu_i S_i + \sum_{j=1}^{n} w_j \mu_j S_j}$   (8)

where $R_i$ is the $i$-th fuzzy rule with the input $x_1$, $R_j$ is the $j$-th rule with the input $x_2$, $w_i$ is the weight of the rule $R_i$, and $w_j$ is that of $R_j$ [3].
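A compact sketch of the weighted moment computation in equation (8) is shown below; the single-input rules fired by all six demographic inputs are simply concatenated into one list, which is one straightforward reading of the equation. The names and the toy numbers are illustrative assumptions.

```python
import numpy as np

def osfl_inference(mu, c, s, w):
    """Ordinal-structure fuzzy inference, eq. (8): rule i contributes its
    truth value mu[i], consequent centre c[i] and area s[i], scaled by the
    rule weight w[i]; the rules of all inputs are concatenated into one list."""
    mu, c, s, w = (np.asarray(v, float) for v in (mu, c, s, w))
    denominator = np.sum(w * mu * s)
    return float(np.sum(w * mu * c * s) / denominator) if denominator > 0 else 0.0

# Toy example with four fired rules
y = osfl_inference(mu=[0.7, 0.2, 0.9, 0.4], c=[0.8, 0.3, 0.6, 0.9],
                   s=[1.0, 1.0, 1.0, 1.0], w=[0.9, 0.4, 0.7, 0.5])
```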
The main difference between the conventional and the ordinal structure fuzzy model is that the rules of the latter are defined as a set of rules ordered by their importance. Each rule is weighted according to how well its conditional part is matched and according to its importance. The determination of the weights is not an easy task, as it affects the accuracy of the predictor model; usually, the knowledge and experiential rules of experts should be incorporated in the system to define the weight of each rule. From equation (8), it can be seen that we need to calculate the weights of the demographic segments over all products, and the weights of the demographic segments for each product, in terms of the biggest customer and best customer indexes. 2.2.3 Calculation of Weights for the Demographic Segment In this paper, we propose the following steps for calculating the weights of the demographic segments:
(i) For each demographic segment, count the number of maximum values (MaxV) obtained for the biggest customer and best customer indexes over all product categories and sub-categories (the total number of product categories and sub-categories is 311).
(ii) Also count the numbers of second maximum values (MaxV2), middle values (MV), lowest values (LV) and second lowest values (LV2).
(iii) Assign a different multiplier to each count, as follows: MaxV = 1, MaxV2 = 0.75, MV = 0,
LV = 0.25 and LV2 = 0.5.
(iv) Calculate the weight of each demographic segment over all products using the following equation:

$W_i = \dfrac{(\mathrm{MaxV} \times 1) + (\mathrm{MaxV2} \times 0.75) + (\mathrm{LV2} \times 0.5) + (\mathrm{LV} \times 0.25) + (\mathrm{MV} \times 0)}{PT}$

where $W_i$ is the weight of demographic segment $i$ over all products and $PT$ is the total number of product categories. The number of maximum values is the number of times the demographic segment scores the highest biggest customer/best customer value over all product categories. 2.2.4 Calculation of Weights for the Demographic Segment for Each Product In order to calculate the weights of the demographic segments for each product, the following steps are taken:
(i) Calculate the scores for the biggest customer and best customer of each demographic segment for each product sub-category. (ii) For each product sub-category, rank the demographic segments in terms of their biggest customer and best customer scores, and assign multipliers to the values as follows: MaxV = 1, MaxV2 = 0.75, MV = 0.5, LV = 0.25, LV2 = 0.
The demographic segment that scores the highest biggest or best customer value gets 1 point, the second highest gets 0.75 points, the lowest gets 0 points and the second lowest gets 0.25 points; the values in between get 0.5 points. The procedure is repeated for all sub-categories in the product category. For example, in the Alcoholic beverages category there are seven sub-categories.
• Calculate the weight of each demographic segment for each product category using the following equation:

$\text{Weight of each demographic segment for each product category} = \dfrac{\sum \text{points associated with each sub-category of the product}}{\text{number of sub-categories of the product}}$   (10)
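A sketch of the per-product weight calculation of Section 2.2.4 is shown below. It follows the point-assignment description in the text (highest = 1, second highest = 0.75, second lowest = 0.25, lowest = 0, everything else = 0.5) and then averages the points over the sub-categories as in equation (10); the array shapes and names are assumptions for illustration.

```python
import numpy as np

def product_segment_weights(scores):
    """scores: (sub-categories x segments) array of best/biggest customer
    values for one product category. Returns one weight per segment,
    i.e. the average of the assigned points over the sub-categories."""
    scores = np.asarray(scores, float)
    n_sub, n_seg = scores.shape                 # assumes at least five segments
    points = np.full_like(scores, 0.5)          # default point for middle ranks
    order = np.argsort(scores, axis=1)          # ascending order per sub-category
    rows = np.arange(n_sub)
    points[rows, order[:, -1]] = 1.0            # highest value in the sub-category
    points[rows, order[:, -2]] = 0.75           # second highest
    points[rows, order[:, 1]] = 0.25            # second lowest
    points[rows, order[:, 0]] = 0.0             # lowest
    return points.mean(axis=0)                  # eq. (10): average over sub-categories
```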
3 Fuzzy Logic Predictor Software The software is developed using the Java programming language, with MySQL for the database. The database is used to store all the data of the market research file, and the data are processed as described above.
The software is designed such that, given a certain combination of demographics for a person, the system calculates the probability of the most likely products the person will buy. The user can view the ranking of the probabilities for each product category, as well as for all product categories chosen by the user. The graphical user interface (GUI) of the software is shown in Fig. 7. The following is a step-by-step description of how the software works.
Fig. 7. GUI for Fuzzy logic predictor
Fig. 8. Example of the types of products available in the software
Fig. 9. The demographic information of the person
Fig. 10. Results for all types of products
Fig. 11. Simulation methodology chart
Step 1: The software requests the types of products the person is interested in purchasing, as shown in Fig. 8.
Step 2: Next, the software requests the demographic information of the person, as shown in Fig. 9.
Step 3: When the 'Calculate Priority' button is pressed, the system calculates the probability of buying each product based on the selected demographics of the person and lists the products in descending order, as shown in Fig. 10.
Step 4: The system then calculates the buying probabilities for all products if all products are chosen. The simulation methodology is summarized in Fig. 11.
Fig. 12. Demographic input data for a person
Fig. 13. Priority of product display
3.1 Simulation Results Based on the fuzzy logic predictor software that has been developed, several simulations were conducted to test the accuracy of the software. In this example, we chose the Telephone product category, which contains four types of sub-products. In order to test the software, the demographics of a particular household must be chosen; the demographic inputs of the chosen household are shown in Fig. 12. For this input data, the simulation results show that the type of product the consumer would most like to purchase, in terms of priority given his/her demographic background, is Telephones, Answering Machines, and Accessories (type 4), whose Fuzzy OSM output score is 0.6176. The result is shown in Fig. 13. The predictor engine has been tested for several demographic configurations, and the results, when compared with the BLS data, give an accurate priority ranking of the products that households with the specified demographics had purchased.
4 Conclusions The fuzzy logic predictor is an engine that can predict which products consumers would be interested to purchase based on their trends, behavior and habits. The predictor models are based on fuzzy logic models, which have the advantage of handling multiple inputs and outputs. Based on the six demographics of the householder, namely his/her age, household income, type of household, race, region and education background, the fuzzy logic models predict the types of products the consumer would like to purchase in order of priority. Software has been developed for this purpose, and the fuzzy logic predictor is compared with the statistical outputs. The results of the fuzzy logic predictor models show a high level of accuracy, and the predictor can be used for many applications.
References 1. Best Customers, Demographics of Consumer Demand, 4th edn. New Strategist Publications, Inc., Ithaca, New York (2006) 2. U.S. Bureau of Labour Statistics (BLS) Consumer Price Index (CPI) 3. Zadeh, L.: Fuzzy sets. Information Control 8, 338–353 (1965) 4. Ohnishi, T.: Fuzzy Reasoning by Ordinal Structure Model of Control Rule. Journal of Japan Society for Fuzzy Theory and Systems 2(4), 125–132 (1990) 5. Naitoh, Y., Furuhashi, T., Uchikawa, Y.: A Variable Ordinal Structure Model for Fuzzy Reasoning and Its Application to Decision Problem Working Order. In: International Conference on Industrial Electronic, Control and Instrumentation, vol. 2, pp. 1539–1543 (1991) 6. Tan, K.K., Khalid, M., Yusof, R.: Intelligent Elevator Control by Ordinal Structure Fuzzy Logic Algorithm. In: Proc. of ICARCV 1997, Singapore (1997) 7. Sano and Kawabe, Technical Report of the IEEJ, IIC-90-14, pp. 19-27 (1990)
Kansei for Colors Depending on Objects Taki Kanda Bunri University of Hospitality, Department of Service Management, 311-1 Kashiwabarashinden, Sayama, Saitama(350-1336), Japan [email protected]
Abstract. Kansei means human feelings in Japanese. Many studies on the analysis or evaluation of Kansei have been conducted; however, Kansei is very complicated and is not easy to analyze or evaluate. When we consider how we feel about color, it is necessary to take various things into consideration. For example, it is important to consider that our preference for colors may vary with the object. Here we take the interior (curtain, wall, floor) and the tableware (plate, cup, pan) and examine how human preference varies with these objects using the method of order. Keywords: Kansei, method of order, Kendall's coefficient of concordance, Friedman's test.
1 Introduction Modern society is called an information-oriented society, and at the same time the present age is called "the age of Kansei". The ages of the manufacturing industry in Japan have been changing: "the age of quantity", characterized by mass production in the period of high economic growth; "the age of quality", characterized by quality control; "the age of diversification", characterized by the manufacture of many kinds of products in small quantities, such as CIM (Computer Integrated Manufacturing), which aims at responding to consumers' various needs; and "the age of Kansei", characterized by the manufacture of one-of-a-kind products. In the age of Kansei, the planning, development, manufacture and sale of products needs to take into consideration not only the product-out idea from the companies' side but also the market-in idea from the consumers' side. It is considered necessary and effective for manufacturing companies to develop products in accordance with consumers' Kansei. Accordingly, Kansei engineering, which was born in Japan as an ergonomic, consumer-oriented technology, was first named "emotion technology" in the 1970s and renamed Kansei engineering in the late 1980s, has become very important in various fields of industry today. In the present social situation, it may be effective for marketing to choose the colors of products so as to fit consumers' Kansei. The objective of this paper is to study how to choose the colors of products for interior and tableware.
2 Experiments on Kansei for Colors by Questionnaire Experiments on Kansei for colors were conducted using the method of order, with 14 university students as subjects. Questionnaires were set out using the 10 colors red, blue, yellow, white, black, purple, green, orange, grey and brown, asking the subjects to rank the colors for each product according to their preference. Kendall's coefficient of concordance was obtained from the results of the questionnaires, and Friedman's test was conducted to confirm the concordance of the 14 subjects' answers. For the evaluation, order statistics from the normal distribution are assigned to each rank; by using order statistics for the evaluation, we obtain evaluation values whose units are standard deviations of the standard normal distribution, measured from a base of 0. In order to investigate how preference differs depending on the object, the coefficient of correlation was calculated between pairs of objects.

3 Test for Concordance of Subjects' Answers

In order to find out whether the subjects' answers are concordant or not, Kendall's coefficient of concordance and Friedman's test are used. For Kendall's coefficient of concordance, let $n$ and $k$ be the number of subjects and the number of stimuli respectively, and let $V_i$ $(i = 1, 2, \ldots, k)$ be the total of the rankings which the subjects give to stimulus $i$. Then the mean of the $V_i$ is

$\bar{V} = \dfrac{n(k+1)}{2}$   (1)

and the sum of the squares of the deviations from the mean $\bar{V}$ is

$S = \sum_{i=1}^{k} (V_i - \bar{V})^2$   (2)

Dividing this by the maximum of $S$, Kendall's coefficient of concordance can be written as

$W = \dfrac{S}{S_{\max}}, \quad 0 \le W \le 1$   (3)

where $S_{\max}$ is the maximum of $S$. In Friedman's test, to confirm the concordance of the rankings by the $n$ subjects, the test statistic

$F_0 = \dfrac{(n-1)W}{1-W}$   (4)

is calculated, and if the calculated value of $F_0$ is not smaller than the critical value of the $F$ distribution for the degrees of freedom

$f_1 = k - 1 - \dfrac{2}{n}, \qquad f_2 = (n-1) f_1$   (5)

it is decided that Kansei is concordant among the $n$ subjects; that is, if

$F_0 \ge F(f_1, f_2, \alpha)$   (6)

is satisfied, it is concluded that the concordance of Kansei among the $n$ subjects is significant at the $\alpha$ level of significance. Furthermore, when $k > 7$, the test statistic

$\chi_0^2 = n(k-1)W$   (7)

is calculated, and if the calculated value of $\chi_0^2$ is not smaller than the critical value of the $\chi^2$ distribution for the degrees of freedom

$f = k - 1$   (8)

it is decided that Kansei is concordant among the $n$ subjects; that is, if

$\chi_0^2 \ge \chi^2(f, \alpha)$   (9)

is satisfied, it is concluded that the concordance of Kansei among the $n$ subjects is significant at the $\alpha$ level of significance.
4 Evaluation of Preference for Colors on Objects Here, interior and tableware are considered, and the preferences for colors of the curtain, wall and floor for the interior, and of the plate, cup and pan for the tableware, are evaluated. The rankings of the colors for curtain, wall and floor by the 14 subjects are shown in Tables 1, 2 and 3 respectively, and those for plate, cup and pan in Tables 4, 5 and 6 respectively. In the tables, C1, C2, C3, C4, C5, C6, C7, C8, C9 and C10 represent red, blue, yellow, white, black, purple, green, orange, grey and brown respectively. Tables 7 and 8 show the results of the significance test on the concordance of the subjects' answers for the interior and the tableware respectively, and Tables 9 and 10 show the evaluation values of each color for the interior and the tableware respectively. The evaluation values are obtained by applying order statistics to each ranking of the colors. In order to compare the evaluation values among objects, the coefficients of correlation between pairs of objects for the interior and the tableware are shown in Tables 11 and 12 respectively. The critical value for the coefficient of correlation is 0.632 at the 5% level of significance for 10 stimuli; hence, if the value of the coefficient of correlation is greater than 0.632, we can say that it is statistically significant at the 5% level.
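The concordance test of Section 3 and the order-statistic evaluation used here can both be sketched in a few lines of Python. The paper does not state how the expected normal order statistics were computed, so the sketch uses Blom's standard approximation as a stand-in; applied to the rankings of Table 1 (n = 14, k = 10), the first function reproduces W of about 0.15 and a chi-square statistic of about 18.4, in line with Table 7.

```python
import numpy as np
from scipy.stats import chi2, norm

def kendalls_w(ranks):
    """ranks: (n subjects x k stimuli) matrix of ranks 1..k.
    Returns Kendall's W (eq. 3), chi0^2 = n(k-1)W (eq. 7) and its p-value."""
    ranks = np.asarray(ranks, float)
    n, k = ranks.shape
    V = ranks.sum(axis=0)                          # column totals V_i
    S = np.sum((V - n * (k + 1) / 2.0) ** 2)       # eq. (2)
    W = S / (n ** 2 * k * (k ** 2 - 1) / 12.0)     # S divided by its maximum
    chi0 = n * (k - 1) * W
    return W, chi0, chi2.sf(chi0, df=k - 1)

def evaluation_values(ranks):
    """Replace rank r by the expected r-th largest standard-normal order
    statistic (Blom's approximation) and average over the subjects."""
    ranks = np.asarray(ranks, int)
    n, k = ranks.shape
    expected = norm.ppf((np.arange(1, k + 1) - 0.375) / (k + 0.25))
    return expected[::-1][ranks - 1].mean(axis=0)  # rank 1 -> largest value

def object_correlation(vals_a, vals_b):
    """Pearson correlation between two objects' evaluation values;
    |r| > 0.632 is significant at the 5% level for k = 10 colors."""
    return float(np.corrcoef(vals_a, vals_b)[0, 1])
```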
Table 1. Ranking of colors for curtain
Subject 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Total
C1 6 10 5 10 9 9 1 4 9 2 1 10 3 10 89
C2 7 9 1 7 1 7 2 5 1 3 2 7 2 9 63
C3 2 8 3 5 6 5 3 2 10 9 3 6 6 5 73
C4 1 7 7 1 3 3 4 1 2 1 4 4 10 1 49
C5 9 6 9 2 2 1 5 6 3 8 10 3 9 2 75
C6 5 5 6 9 8 10 10 10 8 5 5 9 4 6 100
C7 10 4 4 8 10 8 9 9 7 4 6 8 8 3 98
C8 3 3 2 6 7 6 8 3 6 6 7 5 5 7 74
C9 4 2 8 3 4 2 7 8 5 7 8 1 1 4 64
C10 8 1 10 4 5 4 6 7 4 10 9 2 7 8 85
Total 55 55 55 55 55 55 55 55 55 55 55 55 55 55
Table 2. Ranking of colors for wall (rows: colors; columns: subjects 1-14 and row total; each subject's ranks sum to 55)

Color  S1  S2  S3  S4  S5  S6  S7  S8  S9  S10 S11 S12 S13 S14  Total
C1     10   5   5   1  10   8   1   5  10   5   8   9   5   8     90
C2      6   4   3   3   5   3   2   6   7   6   7   8   2   7     69
C3      3   3   1   5   6   2   3   3   6   7   4   5   8   3     59
C4      9   2  10   4   1   1   4   1   1   1   1   1   7   2     45
C5      2   1   8   2   3  10   5   7   2   8  10   4  10   6     78
C6      7   6   4   7   7   9  10  10   9   3   9  10   4  10    105
C7      8   7   2   9   8   4   9   9   8   4   6   7   3   5     89
C8      5   8   6   6   9   5   8   4   5  10   5   6   6   4     87
C9      4   9   9  10   2   6   7   2   4   9   2   2   1   1     68
C10     1  10   7   8   4   7   6   8   3   2   3   3   9   9     80
Table 3. Ranking of colors for floor (rows: colors; columns: subjects 1-14 and row total; each subject's ranks sum to 55)

Color  S1  S2  S3  S4  S5  S6  S7  S8  S9  S10 S11 S12 S13 S14  Total
C1     10   3   3   4   7   7   1  10   8   9   6  10   3   9     90
C2      9   2   1   3   6   8   2   7   7   7   7   7   2   8     76
C3      5   1   2  10   5   5   3   6   6   5   8   5   8   4     73
C4      4   4   4   1   1   3   4   3   1   2   9   6   9   3     54
C5      3   5   5   2   3   1   5   2   3   3  10   2  10   5     59
C6      7   6   7   5   8  10  10   9   9  10   5   9   4   6    105
C7      8   7   6   6   9   9   9   8  10   8   4   8   5   7    104
C8      6   8   9   8  10   4   8   5   5   6   3   4   6  10     92
C9      2   9   8   9   2   2   7   4   4   4   2   3   7   2     65
C10     1  10  10   7   4   6   6   1   2   1   1   1   1   1     52
Table 4. Ranking of colors for plate (rows: colors; columns: subjects 1-14 and row total; each subject's ranks sum to 55)

Color  S1  S2  S3  S4  S5  S6  S7  S8  S9  S10 S11 S12 S13 S14  Total
C1      9   7   8  10   8   9   6   2   4  10   4   5   4   6     92
C2      7   2   7   4   6   8   4   1   1   6   3   2   7  10     68
C3      2   6   4   3   4   1   9   3   2   4   7   7  10   5     67
C4      3   1   1   1   3   3   1   5  10   1   1   1   1   1     33
C5      5  10   3   2   1   4  10   4   6   3  10  10   2   9     79
C6      8   8   9   9   7  10   7   6   5   8   5   4   5   7     98
C7     10   4   5   5   5   7   8   7   9   9   6   3   6   8     92
C8      4   3   6   6   9   2   5   8   3   5   2   6   9   4     72
C9      1   5   2   7   2   6   2  10   8   2   8   8   3   2     66
C10     6   9  10   8  10   5   3   9   7   7   9   9   8   3    103
Table 5. Ranking of colors for cup (rows: colors; columns: subjects 1-14 and row total; each subject's ranks sum to 55)

Color  S1  S2  S3  S4  S5  S6  S7  S8  S9  S10 S11 S12 S13 S14  Total
C1      7   7   8   9   8   4   6   2   4  10   4   2   8   6     85
C2      9   2   7   4   6   7   4   1   1   4   7   5   7   5     69
C3      2   1   4   3   4   1   9   3   2   6   3   1  10  10     59
C4      3   6   1   1   2   3   1   5  10   1   1   7   1   1     43
C5     10  10   3   2   1   9  10   4   6   3  10   3   2   4     77
C6      8   8   9  10   5  10   7   6   5   9   6   4   5   7     99
C7      5   4   5   5   7   8   8   7   9   8   5  10   3   8     92
C8      4   3   6   6   9   2   5   8   3   5   8   6   9   9     83
C9      1   5   2   7   3   6   2  10   8   2   2   8   6   2     64
C10     6   9  10   8  10   5   3   9   7   7   9   9   4   3     99
Table 6. Ranking of colors for pan (rows: colors; columns: subjects 1-14 and row total; each subject's ranks sum to 55)

Color  S1  S2  S3  S4  S5  S6  S7  S8  S9  S10 S11 S12 S13 S14  Total
C1      1   5   8   5   1   7   6   1   1  10   5   4   9   2     65
C2     10   8   9   1  10   6   7   2   3   7   8   3   8   6     88
C3      3   9  10  10   4   5  10   3   6   5   4   7   7  10     93
C4      7   4   3   9   5   1   3   5   8   2   3   9   1   8     68
C5      9   2   2   6   6   2   1   4   4   4   1   2   2   7     52
C6      8   7   7   4   9   8   5   8  10   9   6   5  10   5    101
C7      5   6   6   3   3   9   8   9   2   8  10   6   5   4     84
C8      6  10   5   2   2  10   4   6   7   6   9   8   6   3     84
C9      2   1   1   8   8   3   2   7   5   1   2   1   3   1     45
C10     4   3   4   7   7   4   9  10   9   3   7  10   4   9     90
Table 7. Result of significance test on concordance for interior

Objects   Kendall's coefficient of concordance W   Test statistic χ²_0   Result of significance test
Curtain   0.15                                      18.36                 Significant
Wall      0.17                                      21.04                 Significant
Floor     0.22                                      27.63                 Significant

The number of subjects: n = 14. Degrees of freedom: f = 9. Critical value at the 5% level of significance: χ²(f, 0.05) = 16.92.

Table 8. Result of significance test on concordance for tableware

Objects   Kendall's coefficient of concordance W   Test statistic χ²_0   Result of significance test
Plate     0.24                                      29.88                 Significant
Cup       0.19                                      24.34                 Significant
Pan       0.19                                      24.11                 Significant

The number of subjects: n = 14. Degrees of freedom: f = 9. Critical value at the 5% level of significance: χ²(f, 0.05) = 16.92.

Table 9. Evaluation values of colors for interior
Objects   C1      C2      C3      C4      C5      C6      C7      C8      C9      C10
Curtain   -0.45    0.73    0.10    1.08    0.00   -0.98   -0.91    0.16    0.56   -0.28
Wall      -0.48    0.35    0.65    1.35   -0.07   -1.12   -0.46   -0.44    0.28   -0.05
Floor     -0.52    0.18    0.13    0.88    0.62   -1.17   -1.02   -0.54    0.40    1.03
Table 10. Evaluation values of colors for tableware

Objects   C1      C2      C3      C4      C5      C6      C7      C8      C9      C10
Plate     -0.34    0.22    0.20    1.03   -0.11   -0.44   -0.33    0.09    0.26   -0.58
Cup       -0.16    0.20    0.37    0.80   -0.07   -0.48   -0.32   -0.13    0.30   -0.50
Pan        0.30   -0.23   -0.41    0.21    0.56   -0.52   -0.17   -0.17    0.76   -0.33
Table 11. Comparison of evaluation between objects for interior

Objects             Coefficient of Correlation   Result of Significance
Curtain and Wall    0.85                          Significant
Wall and Floor      0.75                          Significant
Floor and Curtain   0.85                          Significant
Table 12. Comparison of evaluation between objects for tableware

Objects          Coefficient of Correlation   Result of Significance
Plate and Cup    0.76                          Significant
Cup and Pan      0.25                          Not Significant
Pan and Plate    0.31                          Not Significant
5 Concluding Remarks
Here we were concerned with Kansei for colors depending on objects. Curtain, wall and floor were chosen as interior objects and plate, cup and pan as tableware objects, and it was studied how our preference for colors varies among objects using the method of order. From Friedman's test conducted here, it was confirmed that the subjects' answers were concordant for both the interior objects (curtain, wall and floor) and the tableware objects (plate, cup and pan). Accordingly, it might be said that our preferences for the colors of a given object are much the same across subjects. The coefficient of correlation was calculated to compare the preference for colors between every two objects in interior and tableware. For interior, the values of the coefficient of correlation were significant at the 5% level of significance for every pair of objects (curtain and wall, wall and floor, floor and curtain), while for tableware the values were significant at the 5% level of significance only for plate and cup and not significant for the other pairs. It can therefore be said that our preference for colors may depend on the object.
References
1. Yamauti, Z.: Statistical Tables and Formulas with Computer Applications JSA-1972. Japanese Standard Association, Tokyo (1972)
2. Nagamachi, M.: A New Ergonomic Consumer-Oriented Technology for Product Development. International Journal of Industrial Ergonomics 15(1), 3-12 (1995)
3. Nagamachi, M.: Framework of Kansei Engineering and its Application to a Cosmetic Product. In: Proc. of the 5th International Conference on Engineering Design and Automation, pp. 803-808 (August 2001)
4. Kanda, T.: Studies on Kansei Concerning Colors and Good Tasting. In: Proc. of the 6th International Conference on Engineering Design and Automation (EDA 2002), pp. 807-812 (2002)
5. Nagasawa, S.: Kansei and Business. Kansei Engineering International 3(3), 3-12 (2002)
6. Kanda, T.: Colors and Kansei. In: Conference Proc. of the 2004 IEEE International Conference on Systems, Man & Cybernetics (IEEE SMC 2004), pp. 299-305 (2004)
7. Nagasawa, S., Kanda, T.: Progress of Kansei Engineering and Kansei Goods. In: Proc. of the Second International Conference on Artificial Intelligence in Engineering and Technology (ICAIET 2004), vol. 1, pp. 302-307 (2004)
8. Kanda, T.: Feelings for Good Tasting to Colors of Food and Kansei Engineering. In: Proc. of the International Conference on Recent Trends in Information Systems (IRIS 2006), pp. 555-563 (2006)
A Hybrid Intelligent Algorithm for Solving the Bilevel Programming Models Shamshul Bahar Yaakob1,2 and Junzo Watada1 1
Graduate School of Information, Production and Systems, 2-7, Hibikino, Wakamatsu, Kitakyushu, 808-0135 Japan 2 School of Electrical Systems Engineering, Universiti Malaysia Perlis, 02600 Jejawi, Perlis, Malaysia [email protected]
Abstract. In this paper, a genetic algorithm (GA) and a neural network (NN) are integrated to produce a hybrid intelligent algorithm for solving bilevel programming models. The GA is used to select a set of potential combinations from all generated combinations. Then a meta-controlled Boltzmann machine, formulated by combining a Hopfield network and a Boltzmann machine (BM), is used to effectively and efficiently determine the optimal solution. The proposed method is used to solve examples of bilevel programming for two-level investment in a new facility. The upper layer decides the optimal company investment, and the lower layer decides the optimal department investment. Typical application examples are provided to illustrate the effectiveness and practicability of the hybrid intelligent algorithm.

Keywords: bilevel programming model, neural network, Boltzmann machine, Hopfield network, meta-controlled Boltzmann machine.
1 Introduction
Bilevel programming has been successfully applied in many areas, including management science, economics and engineering, since 1977, when Candler and Norton [1] first proposed the concept. Even today, bilevel programming has broad application potential. However, due to its inherent non-convexity and non-differentiability, bilevel programming problems are often NP-hard and very difficult to solve, especially nonlinear bilevel programming problems. A problem is NP-hard if an algorithm for solving it can be translated into one for solving any NP problem. In the past few decades, efforts have been made to provide effective algorithms for bilevel programming problems [2-4]. The basic concept of the bilevel programming method is that an upper-level decision maker sets his or her goal and/or decisions and then asks each subordinate level of the organization for their optima, which are calculated in isolation. Lower-level decision makers' decisions are then submitted and modified by the upper-level decision maker with consideration of the overall benefits for the organization. The process is continued until a satisfactory solution is reached [5]. This decision-making process
is extremely useful in such decentralized systems as agriculture, government policy, economic systems, finance, power systems, transportation and network design, and is particularly suitable for conflict resolution [6-8]. Known algorithms for solving bilevel programming have certain limitations: they may return only a local optimal solution, rely on specific structures of the bilevel program, or depend on Kuhn-Tucker conditions. Branch-and-bound methods [9,10], descent algorithms [11,12] and evolutionary methods [13,14] have been proposed for solving bilevel programming problems based on such reformulations. Compared with classical optimization approaches, the prominent advantage of neural computing is that it can converge to the equilibrium point (optimal solution) rapidly, and this advantage has been attracting researchers to solve bilevel programming problems using neural network approaches. Recently, Shih [15] and Lan [16] have proposed neural networks for solving the linear bilevel programming problem. However, it is worth pointing out that there are no reports on solving the mixed integer quadratic bilevel programming problem using a neural network approach. In addition, most existing methods require additional conditions for a solution, and it is difficult to extend them to general nonlinear bilevel programming problems. The objective of this paper is to find the optimal combination of investments in a new facility. In this study, a hybrid intelligent algorithm is employed to solve the bilevel programming problems. The remainder of the paper is organized as follows. Section 2 briefly explains the bilevel programming problem. Section 3 introduces the Hopfield network and the BM. Section 4 describes our proposed algorithm for solving bilevel programming problems. Results of a numerical example are reported in Section 5. Finally, Section 6 concludes the paper.
2 Bilevel Programming Problems
Bilevel programming is a case of multilevel mathematical programming that is defined to solve decentralized planning problems with multiple decision makers in a multilevel or hierarchical organization [14]. Across the levels, the decision makers play a game called the Stackelberg game [15], in which the follower responds to any decision made by the leader but is not controlled directly by the leader. Thus the leader is able to adjust the performance of the overall multilevel system indirectly through his decisions. Bilevel programming involves two optimization problems, where the constraint region of the first-level problem is implicitly determined by the second-level optimization problem. In a bilevel programming problem, a top-level decision maker has control over the vector x while a bottom-level decision maker controls the vector y. Letting the performance functions F(x, y) and f(x, y) of the two decision makers be linear and bounded, the bilevel programming problem can be represented as

max_x F(x, y)   (Upper Layer)
subject to G(x, y) ≤ 0,

where y = y(x) is implicitly defined by

min_y f(x, y)   (Lower Layer)
subject to g(x, y) ≤ 0.

Obviously, the bilevel programming model consists of two sub-models: the upper layer, which is defined as the upper level, and the lower layer, which is the lower level. F(x, y) is the objective function of the upper-level decision makers and x is the decision vector of the upper-level decision makers; G(x, y) is the constraint set of the upper-level decision vector. f(x, y) is the objective function of the lower-level decision makers; y is the decision vector of the lower-level decision makers; and g(x, y) is the constraint set of the lower-level decision vector; y = y(x) is usually called the reaction or response function. The key to solving the bilevel programming model is to obtain the response function by solving the lower-level problem and to replace the variable y in the upper-level problem with the relationship between x and y, i.e. the response function. As can be seen, the response function connects the upper- and lower-level decision variables, which makes the two programming models dependent on one another. Optimizing only one of the variables cannot optimize the performance of the whole system, which reflects the essence of the response function. Compared with the conventional single-level programming model, bilevel programming models have considerable advantages. The main advantages are that (i) bilevel programming can be used to analyze two different and even conflicting objectives at the same time in the decision-making process; (ii) the multiple criteria decision-making methods of bilevel programming can reflect the practical problem better; and (iii) the bilevel programming methods can explicitly represent the mutual action between the department managers and the company directors. New facility investments involve two kinds of decision-makers, namely department managers and company directors, who have different objective functions. It is therefore appropriate to adopt the bilevel programming model to describe the investment problem. In this study, a neural network (NN)-based method is employed to solve the bilevel programming problems.
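As an illustration of this leader-follower structure, a small brute-force sketch is given below for a hypothetical bilevel problem; the objective and constraint functions and the grids are purely illustrative and are not the investment model used later in the paper.

```python
import numpy as np

# Hypothetical leader/follower objectives and constraints, for illustration only.
F = lambda x, y: x + 2 * y            # upper-level objective (to maximize)
f = lambda x, y: (y - x) ** 2         # lower-level objective (to minimize)
G = lambda x, y: x + y - 10           # upper-level constraint G <= 0
g = lambda x, y: y - x                # lower-level constraint g <= 0

xs = np.linspace(0, 10, 101)
ys = np.linspace(0, 10, 101)

def follower_response(x):
    """For a fixed leader decision x, the follower minimizes f subject to g <= 0."""
    feasible = [y for y in ys if g(x, y) <= 0]
    return min(feasible, key=lambda y: f(x, y)) if feasible else None

best = None
for x in xs:
    y = follower_response(x)          # the rational reaction y(x)
    if y is None or G(x, y) > 0:
        continue
    if best is None or F(x, y) > F(*best):
        best = (x, y)
print("leader x*, follower y(x*):", best)   # here (5.0, 5.0)
```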
3 Hopfield and Boltzmann Machine
Research into mutually connected network behavior started around 1948. Simply speaking, it is difficult to select the required number of units and, at the same time, minimize the corresponding energy function; this problem cannot be solved by either the Hopfield network or the Boltzmann machine alone. In 1982, Hopfield [17] brought together several earlier ideas concerning recurrent neural networks, accompanied by a complete mathematical analysis. Although the Hopfield network is not a good solution for many applications, it nevertheless warrants revisiting in terms of its structure and internal working. This leads to a modification, by incorporating mutual connections, in order to overcome its drawbacks, in the form of the Boltzmann machine. The Hopfield network is a fully connected, recurrent neural network which uses a form of "generalized Hebb rule" to store Boolean vectors in its memory. Each neuron n has a state value denoted by s_n. In any situation, combining the states of all units leads to a global state for the network. This global state is the input which, together with other prototypes, is stored in the weight matrix by Hebb's postulate, formulated as
W_ij = (1/N) Σ_p X_i^p X_j^p,   (1)

where the sum runs over the training patterns p, W_ij is the weight of the connection from neuron j to neuron i, N is the dimension of the vectors, and X_i^p is the p-th input for neuron i. In other words, using Hebb's postulate, we create the weight matrix which stores all the prototypes that we want the network to remember. Because of these features, it is sometimes referred to as an "auto-associative memory". However, it is worth noting that the maximum number of prototypes that a Hopfield network can store is only about 0.15 times the total number of units in the network [17]. One application of the Hopfield network is to use it as an energy minimizer. This application comes to life because of the ability of Hopfield networks to minimize an energy function during their operation. The simplest form of the energy function is given by the following:

E = −(1/2) Σ_{j=1}^{N} Σ_{i=1}^{N} w_ji s_j s_i   (2)

Here w_ij denotes the strength of the influence of neuron j on neuron i. The w_ij are created using Hebb's postulate as mentioned above, and they belong to a symmetric matrix whose main diagonal contains only zeroes (which means there are no self-feedback connections). The Boltzmann machine is an interconnected neural network and a modification of the Hopfield network which helps it escape from local minima. The main idea is to employ simulated annealing, a technique derived from the metallurgy industry. In the "annealing schedule", T_0 is a critical system parameter which represents the initial value of the temperature. If T_0 is very large, a strategy is pursued whereby neurons flip on and off at random, totally ignoring incoming information. If T_0 is close to zero, the network behaves "deterministically". Although the way in which a Boltzmann machine works is similar to a Hopfield network, we cannot use Hebb's postulate to create the weight matrix representing the correlations between units. Instead, we have to use a training (learning) algorithm, one based on the Metropolis algorithm. The Boltzmann machine can be seen as a stochastic, generative counterpart of the Hopfield network. In the Boltzmann machine, probability rules are employed to update the state of the neurons and the energy function as follows. If V_i(t + 1) is the output of neuron i in the subsequent time iteration t + 1, then V_i(t + 1) is 1 with probability P and 0 with probability 1 − P, where

P[V_i(t + 1)] = f(u_i(t) / T).   (3)
Here, f(·) is the sigmoid function, u_i(t) is the total input to neuron i shown in equation (4), and T is the network temperature, with

u_i(t) = Σ_{j=1}^{n} w_ij V_j(t) + θ_i,   (4)

where w_ij is the weight between neurons i and j, θ_i is the threshold of neuron i, and V_i is the state of unit i. The energy function E proposed by Hopfield is written as:

E = −(1/2) Σ_{i,j=1}^{n} w_ij V_i V_j − Σ_{i=1}^{n} θ_i V_i.   (5)
Hopfield has shown that this energy function decreases monotonically as the network state is updated [17]; therefore, it is likely to converge to a local minimum. In the case of the Boltzmann machine, however, the energy function can increase with some small probability, and is therefore less likely to become trapped in a local minimum. Thus, the combination of the Hopfield network and the Boltzmann machine offers a way to overcome the problem of finding the optimal number of units in the neural network.
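A compact sketch of the two update rules discussed above is given below, assuming binary 0/1 units, a symmetric zero-diagonal weight matrix, and a simple geometric cooling schedule; all of these are illustrative choices rather than the settings used in the experiments.

```python
import numpy as np

def energy(W, theta, V):
    """Energy of eq. (5): E = -1/2 sum_ij w_ij V_i V_j - sum_i theta_i V_i."""
    return -0.5 * V @ W @ V - theta @ V

def boltzmann_step(W, theta, V, T, rng):
    """One stochastic sweep of a Boltzmann machine, following eqs. (3)-(4)."""
    for i in rng.permutation(len(V)):
        u = W[i] @ V + theta[i]               # total input u_i(t)
        p = 1.0 / (1.0 + np.exp(-u / T))      # sigmoid acceptance probability
        V[i] = 1.0 if rng.random() < p else 0.0
    return V

rng = np.random.default_rng(0)
n = 8
W = rng.normal(size=(n, n)); W = (W + W.T) / 2; np.fill_diagonal(W, 0.0)
theta = rng.normal(size=n)
V = rng.integers(0, 2, size=n).astype(float)
T = 5.0
for _ in range(100):                          # simulated annealing loop
    V = boltzmann_step(W, theta, V, T, rng)
    T *= 0.95                                 # geometric cooling
print("final state:", V, "energy:", energy(W, theta, V))
```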
4 Hybrid Intelligent Algorithm
Conventionally, the number of units is decided on the basis of expert experience. In order to solve this problem, we formulate a hybrid intelligent algorithm that combines a GA with a meta-controlled BM, a structure consisting of both Hopfield and Boltzmann machine neural networks. This hybrid intelligent algorithm can be applied to bilevel programming problems. The hybrid intelligent algorithm model has two layers, referred to as the upper and lower levels, respectively. The functions of the levels are as follows:
1. The upper level decides how much the company will invest in a new facility (based on the numerical example applied in this study).
2. The lower level decides how much the department will invest in a new facility (based on the numerical example applied in this study).
4.1 Meta-controlled Boltzmann Machine
This meta-controlled BM is a new type of neural network model, which deletes units (neurons) in the lower layer that are not selected in the upper layer during execution. The lower layer of the meta-controlled BM is restructured using the units selected by the upper layer. Because of this feature, the meta-controlled BM converges more efficiently than a conventional BM. This is an efficient method for solving a selection problem by transforming its objective function into the energy function, since the Hopfield and Boltzmann networks converge at the minimum point of the energy function.
The meta-controlled BM just described converts the objective function into energy functions of two components, namely the upper layer and the lower layer. The meta-controlled BM is tuned such that the upper layer influences the lower layer with probability 0.9, and the lower layer influences the upper layer with probability 0.1. These probabilities were selected by trial and error: it was found that they provide the best solutions with our proposed method. Thus the meta-controlled BM is iterated with Y_i = 0.9 y_i + 0.1 x_i for the upper layer, and X_i = x_i * (0.9 * y_i + 0.1) for the lower layer. Here Y_i is the value transferred to the corresponding node in the upper layer, X_i is the value transferred to the corresponding node in the lower layer, y_i is the value of the present state at node i in the upper layer, and x_i is the value of the present state at node i in the lower layer. X_i means that the value is influenced to the tune of 90% by the value of node i in the upper layer: when y_i is 1, X_i = x_i; otherwise, when y_i is 0, 10% of the value of x_i is transferred to the other nodes. On the other hand, Y_i receives a 10% influence from the lower layer. Therefore, even if the upper layer converges to a local minimum, the disturbance from the lower layer makes the upper layer escape from this local minimum.
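A small sketch of the coupling rules just described, with the 0.9/0.1 weights exposed as parameters, is shown below.

```python
def couple_layers(y, x, w_upper=0.9, w_lower=0.1):
    """Exchange information between the upper (meta-controlling) and lower layers.
    y, x: lists of current states of the upper- and lower-layer units."""
    Y = [w_upper * yi + w_lower * xi for yi, xi in zip(y, x)]    # Y_i = 0.9 y_i + 0.1 x_i
    X = [xi * (w_upper * yi + w_lower) for yi, xi in zip(y, x)]  # X_i = x_i (0.9 y_i + 0.1)
    return Y, X
```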
4.2 Hybrid Intelligent Algorithm
The proposed algorithm is summarized as follows and is shown in Fig. 1:
Block (a)
Step_0. Start with a series of randomly generated combinations. The GA then selects a set of potential solution combinations.
Blocks (b) and (c)
Step_1. Set each parameter of the meta-controlled BM to its initial value.
Step_2. Input Ku (weight for the upper layer) and Kl (weight for the lower layer).
Step_3. Execute the upper layer.
Step_4. If the output value of a unit in the upper layer is 1, add some amount of this value to the corresponding unit in the lower layer. Execute the lower layer.
Step_5. After executing the lower layer at a constant frequency, decrease the temperature.
Step_6. If the output value is sufficiently large, add a certain amount of the value to the corresponding unit in the upper layer.
Step_7. Iterate from Step_3 to Step_6 until the temperature reaches the restructuring temperature.
Step_8. Restructure the lower layer using the units selected in the upper layer.
Step_9. Execute the lower layer until the termination condition is reached.
Note: The execution of blocks (b) and (c) is repeated for each potential solution.
Block (d)
Step_10. List all the individual solutions.
Fig. 1. Hybrid intelligent algorithm for solving the bilevel programming models
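The overall flow of blocks (a)-(d) can be sketched as follows; run_ga and run_meta_controlled_bm are placeholder stubs standing in for the GA and the meta-controlled BM described above, not the actual implementation, and the temperature parameters are illustrative.

```python
import random

def run_ga(problem, population=20):
    """Placeholder for Block (a), Step_0: here it only samples random 0/1 selection
    vectors; a real implementation would evolve them against the upper-level energy."""
    n = problem["n_units"]
    return [[random.randint(0, 1) for _ in range(n)] for _ in range(population)]

def run_meta_controlled_bm(problem, combination, T0, T_restructure, T_stop):
    """Placeholder for Blocks (b)-(c), Steps 1-9: anneal from T0, restructure the lower
    layer once T_restructure is reached, then run to termination at T_stop."""
    T = T0
    while T > T_stop:
        # ... upper/lower layer execution and information exchange would go here ...
        if T <= T_restructure:
            pass  # Step_8: restructure the lower layer using the selected upper-layer units
        T *= 0.95
    return combination

def hybrid_intelligent_algorithm(problem, T0=10.0, T_restructure=1.0, T_stop=0.01):
    candidates = run_ga(problem)                                   # Block (a)
    solutions = [run_meta_controlled_bm(problem, c, T0, T_restructure, T_stop)
                 for c in candidates]                              # Blocks (b) and (c)
    return solutions                                               # Block (d), Step_10

print(len(hybrid_intelligent_algorithm({"n_units": 10})))
```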
5 Numerical Example
Investment in a new facility in a department can be divided into two levels, company investment and department investment, each of which has its own objective function and decision variables. Investments from the company budget and the department budget influence each other. The company budget can influence the reactions of the department through the decision variables, while the department has full authority to decide how to optimize its own objective function in view of the decisions made with the company budget. This two-level investment in a new facility therefore fits the Stackelberg game model and can be solved by a bilevel programming model. The notation is as follows:
x: company investment in the new facility
y: department investment in the new facility
I1: maximum limit of company investment in the new facility
I2: maximum limit of department investment in the new facility
p: utility parameter of investment in the new facility
q: the assumed share of the benefits of the two-level investment that the company investment returns to the department
Assuming that the company first chooses its decision variable, the utility function of company investment in the new facility is

Upper level:
L1(x, y, p) = pxy   (6)

and the utility function of the opportunity cost is

L1^0(x, y) = sx + sy + a1 x + a2 (x + y).   (7)

Hence, we have

F(x, y, p) = pxy − sx − sy − a1 x − a2 (x + y).   (8)

The related fee of the company investment in the new facility is a1 x; let

g1(x, y) = x + a1 x − I1   (s.t. x + a1 x ≤ I1).   (9)

Lower level:
The utility function of department investment in the new facility is

L2(x, y, p, q) = pqxy   (10)

and the utility function of the opportunity cost is

L2^0(x, y) = sy + a2 (x + y).   (11)

Hence, we have

f(x, y, q) = pqxy − sy − a2 (x + y).   (12)

The related fee of the department investment in the new facility is a2 (x + y); let

g2(x, y) = x + a2 (x + y) − I2   (s.t. x + a2 (x + y) ≤ I2).   (13)
Set s = 0.08, a1 = 0.001, a2 = 0.0005, I1 = 3 billion USD, I2 = 2 billion USD, p = 0.75 and q = 0.9. In this case study, ten new facilities are considered. A partial solution is determined for the upper level. This solution is then transferred to the lower level as known parameters, and the meta-controlled BM is implemented to solve the lower-level problem. The solution of the lower-level problem communicates with the upper layer until convergence or until the stopping criteria are met. We converted the objective functions into the energy function shown in (14) for the upper level and in (15) for the lower level of the bilevel programming problem:

max E_upper = −(1/2) (Σ_{i=1}^{n} Σ_{j=1}^{m} p x_i y_j + 2 Σ_{i=1}^{n} Σ_{j=1}^{m} x_i y_j) + 2 Σ_{i=1}^{n} x_i − K Σ_{i=1}^{n} Σ_{j=1}^{m} [s x_i + s y_j + a1 x_i + a2 (x_i + y_j)]   (14)

E_upper decides how much of the company budget will be invested in a new facility. In the upper layer, the GA finds a feasible solution and passes it to the lower level to allocate the optimal investment to a selected department.

max E_lower = −(1/2) (Σ_{i=1}^{n} Σ_{j=1}^{m} p q x_i y_j + 2 Σ_{i=1}^{n} Σ_{j=1}^{m} x_i y_j) + 2 Σ_{i=1}^{n} x_i − K Σ_{i=1}^{n} Σ_{j=1}^{m} [s y_j + a2 (x_i + y_j)]   (15)

E_lower decides how much the department will invest in a new facility. K is a weight for the meta-controlled BM. To solve this problem, the proposed hybrid intelligent algorithm has been applied; the results are shown in Table 1. As shown in Table 1, in the case of K = 0.1, only one new facility is selected for investment and the corresponding optimal investment is (x*, y*) = (18.9217, 29.1749). The new facilities not included in the list of units after restructuring are not targets for investment. In the case of K = 0.3, the number of new facilities selected for investment was four, and the corresponding optimal investment is (x*, y*) = (19.9256, 29.9325). When K = 0.5, five new facilities are selected from the list of units after restructuring, and the corresponding optimal investment is (x*, y*) = (19.9313, 29.9410). In the case of K = 0.7, six new facilities are selected, and in the case of K = 1.0, nine facilities are selected from the list of units after restructuring. From the results of our study, we conclude that the number of new facilities selected from the restructured list is directly proportional to K.

Table 1. Results for company and department investment according to K
Value of K   Company Investment   Department Investment   No. of Selected New Facilities
0.1          18.9217              29.1749                  1
0.3          19.9256              29.9325                  4
0.5          19.9313              29.9410                  5
0.7          19.9516              29.9590                  6
1.0          19.9756              29.9610                  9
6 Conclusions
A new proposal was presented to solve mixed integer quadratic bilevel programming problems. We proposed a hybrid intelligent algorithm that combines a GA and a meta-controlled BM. The proposed hybrid intelligent algorithm proved to be very efficient in terms of both computation and quality of solutions. By the same token, we can conclude that the proposed method is capable of solving a bilevel programming problem. Furthermore, the approach is easy to use and is broadly applicable to complicated mixed integer quadratic bilevel programming problems. Although the examples demonstrated the efficiency of the proposed hybrid intelligent algorithm, improvements still need to be made. For example, an efficient heuristic algorithm that can simultaneously retrieve the solution from both the upper-level and lower-level problems may be developed for solving large-scale problems.
References 1. Candler, W., Norton, R.: Multilevel programming. Technical Report 20, World Bank Development Research Center, Washinton, DC (1977) 2. Bard, J., Moore, J.: A branch and bound algorithm for the bilevel programming problem. SIAM Journal on Scientific and Statistical Computing 11(2), 281–292 (1990) 3. Aiyoshi, E., Shimizu, K.: A solution method for the static constrained stackelberg problem via penalty method. IEEE Transactions on Automatic Control 29(12), 1111–1114 (1984) 4. Wen, U., Huang, A.: Simple tabu search method to solve the mixed-integer linear bilevel programming problem. European Journal of Operation Research 88(3), 563–571 (1996) 5. Bard, J.F.: Coordination of a multidivisional organization through two levels of management. Omega 11(5), 457–468 (1983)
6. Carrion, M., Arroyo, J.M., Conejo, A.J.: A bilevel stochastic programming approach for retailer futures market trading. IEEE Transactions on Power Systems 24(3), 1446–1456 (2009) 7. Shimizu, K., Ishizuka, Y., Bard, J.F.: Nondifferentiable and two-level mathematical programming. Kluwer Academic, Boston (1997) 8. Colson, B., Marcotte, P., Savard, G.: Bilevel programming: A survey. International Journal of Operations Research 3(2), 87–107 (2005) 9. Al-Khayyal, F., Horst, R., Paradalos, P.: Global optimization on concave functions subject to quadratic constraints: an application in nonlinear bilevel programming. Annals of Operations Research 34(1), 125–147 (1992) 10. Edmunds, T., Bard, J.F.: An algorithm for the mixed-integer nonlinear bilevel programming problem. Annals of Operations Research 34(1), 149–162 (1992) 11. Savard, G., Gauvin, J.: The steepest descent direction for the nonlinear bilevel programming problem. Operations Research Letters 15(5), 265–272 (1994) 12. Vicente, L., Savarg, G., Judice, J.: Descent approaches for quadratic bilevel programming. Journal of Optimization Theory and Applications 81(2), 379–399 (1994) 13. Hejazi, S.R., Memariani, A., Jahanshanloo, G., Sepehri, M.M.: Bilevel programming solution by genetic algorithms. In: Proceeding of the First National Industrial Engineering Conference (2001) 14. Yin, Y.: Genetic algorithms based approach for bilevel programming models. Journal of Transportion Engineering 126(2), 110–120 (2000) 15. Shih, H.S., Wen, U.P.: A neural network approach for multi-objective and multilevel programming problems. Journal of Computer and Mathematics with Applications 48(1-2), 99–108 (2004) 16. Lan, K.M., Wen, U.P.: A hybrid neural network approach to bilevel programming problems. Applied Mathematics Letters 20(8), 880–884 (2007) 17. Hopfield, J.J.: Neural networks and physical systems with emergent collective computational abilities. In: Proceedings of National Science, pp. 2554–2558 (1982)
Using Semantics to Bridge the Information and Knowledge Sharing Gaps in Virtual Engineering Javier Vaquero1, Carlos Toro1, Carlos Palenzuela2, and Eneko Azpeitia3 1
Vicomtech Research Centre, Mikeletegi Pasalekua 57, 20009 San Sebastian, Spain {jvaquero,ctoro}@vicomtech.org 2 Inge-Innova, Albert Einstein 44, 01510 Miñano, Spain 3 Inertek 3D, Laga Bidea 804, 48160 Derio, Spain
Abstract. In a Product Life Cycle (PLC) scenario, different Virtual Engineering Applications (VEA) are used in order to design, calculate and, in general, provide an application scenario for computational engineering. The diverse VEA are not necessarily available when information sharing is needed, a fact that represents a semantic loss, as the knowledge gained by using one VEA can be lost when a data translation occurs (e.g. a Finite Element program is normally able to export only the geometry). In this paper we present an architecture and a system implementation based on semantic technologies which allow seamless information and knowledge sharing between VEA in a PLC scenario. Our approach is validated through a Plant Layout Design application which is able to collect the knowledge provided by the different VEA available. This work presents our system, leaving a statistical analysis for future work, as our system is currently being tested.

Keywords: Virtual Engineering, Ontologies, Product Life Cycle.
1 Introduction
Virtual Engineering (VE) is defined as the integration of geometric models and their related engineering tools, such as analysis, simulation, optimization and decision making, within a computerized environment that facilitates multidisciplinary and collaborative product development [1]. Virtual Engineering Applications (VEA) are the software implementations of VE. Each VEA in turn contains a set of Virtual Engineering Tools (VET), which is the collection of features that the VEA offers; e.g. in a CAD-like VEA, tools such as makeLine() and drawCircle() should exist. Nowadays, VEA barely exploit contextual facts, user requirements, user experience and, in general, factors that could easily be modelled and taken advantage of from a semantics point of view. From a Product Life Cycle perspective, the use of different VEA in each of the stages is common practice, for example CAD in the Design stage or FEA in the Analysis stage. Some VEA even prove valuable in several stages. However, in this scenario a semantic loss has been reported [2], as in most cases the geometry is the only feature that prevails. Information and knowledge sharing between VEA has
become an important gap for several reasons, which include the commercial interests of the manufacturers, the nature of legacy products and language incompatibilities. To approach a solution to this problem, we propose an architecture and a system implementation where all the knowledge generated by the VEA in a PLC, through their underlying VET, is stored in a central knowledge repository. This repository is accessible to the aforementioned Virtual Engineering Applications in a persistent manner, providing semantic support for the PLC and hence removing redundancy whilst reducing the significant costs associated with typical format-healing problems and information loss. This paper is organized as follows. In Section 2, we briefly present the related concepts necessary for our approach. In Section 3, we present our proposed architecture for the semantic approach to the information loss problem in the PLC. In Section 4, we present a system based on the proposed architecture for a plant layout design task. Finally, in Section 5 we present our conclusions and future work.
2 Related Concepts
In this section we present a short overview of some concepts relevant to this paper. The interested reader is invited to review [3], [4], [5] for a wider explanation of the concepts presented.
2.1 Virtual Engineering Applications (VEA)
As stated in the introduction, VEA, as implementations of the VE concept, are arguably the main facilitators of new product development. In Fig. 1, the components of a VEA are presented following the sub-division introduced in [3].
Fig. 1. According to [3], VEA are divided into 5 different parts: characteristics, requirements, interaction paradigms, underlying VET and extension capabilities
In general, a VEA is composed of:
(i) A Set of Characteristics, which are the expected benefits of the VEA in terms of capabilities, features and accepted formats for input/output interactions.
(ii) A Set of Requirements, which are the minimum requisites that must be met by the computer system where the VEA will be used.
(iii) A Set of Interaction Paradigms, which are the different GUIs, input/output device characteristics (e.g. mouse, screen), etc.
(iv) A Set of VET, which is the collection of underlying tools that allow the fulfilment of the characteristics of the VEA.
(v) An Extension Capabilities Mechanism, which is generally provided via an API or scripting languages, allowing the programmatic extension of the VEA.
2.2 Classical Approaches in Interoperability between Virtual Engineering Applications
The interoperability between VEA is classically approached through data exchange. As each VEA supports its own native format for the serialization of its internal data, it is necessary to use a translator between native formats. Translators are often specific to certain needs, and arguably no extrinsic knowledge is translated from VEA to VEA, resulting in problems when a new functionality becomes available (a new VET, for example) and producing inoperativeness until an updated version is released [6]. Translators do not render a complete solution because suppliers provide compatibility only for a small set of features, and in many cases the translation results make no sense because different perspectives are not taken into account. One example of the aforementioned scenario is the case of a pipe element designed in CAD and the same element considered in a cost-handling VEA: although the same object is involved, the translation is meaningless because the perspective changes. In some cases, the VEA supplier provides APIs for the development of new translators [7], [8], but this is arguably not a common practice due to marketing reasons or legacy systems. Another reported solution is related to the use of open standards for data exchange. Examples of these are IGES [9] for graphics or EDI [10] for ERP. These open standards provide a way to share information, but such information is far from the concept of knowledge, as specialised relationships are lost in the translation, resulting in the semantic loss described in [2]. Semantic interoperability is one of the newest approaches in the state of the art. It specifically aims at the development of supporting applications with the ability to automatically interpret the exchanged information meaningfully and accurately, producing useful results. Ontologies allow information exchange approaching this kind of interoperability [4]: the cases of e-Learning [11], Electronic Commerce [12] and Geographic Information Systems [13] are well-documented examples of the success of this kind of interoperability.
2.3 Ontologies
Ontologies play a fundamental role in the Semantic Web paradigm [14]. In the Computer Science domain, the widely accepted definition (given by Gruber) states that "an ontology is the explicit specification of a conceptualization" [5]. In other words, it is a description of the concepts and their relationships in a domain of study. Fikes [15] identifies four top-level application areas where ontologies are applicable:
(i) collaboration, (ii) interoperation, (iii) education and (iv) modelling. Within the VE domain, Mencke [6] considers three major areas where ontologies can be used: (i) Virtual Design, referring to the construction of virtual prototypes and their use with applications for control, monitoring and management; (ii) Test & Verification, referring to simulation for checking the correctness and applicability of the virtual prototypes; and (iii) Visualization & Interaction, referring to the presentation of virtual objects and the user interaction. The use of ontologies is validated in our proposal by the fact that in a PLC each of the involved VEA comprises a different level of specificity, a fact that leads to the need for a strong knowledge base paradigm if semantic loss is to be avoided. In our approach, we propose the use of a supporting knowledge base whose design language should be expressive enough to contain the particularities of every VEA involved. The aforementioned supporting knowledge base should provide mechanisms to reason over and be queried about its contents, and at the same time it should be flexible enough to accept the inclusion of new relationships on the fly (in order to support the addition of new VET). Ontologies allow the representation of such a knowledge base, including stability checking after on-the-fly changes.
3 Proposed Architecture
A traditional PLC information flow is depicted in Fig. 2a. We propose a supporting Knowledge Base that stores and handles the gathered knowledge in a centralized way, as shown in Fig. 2b. In our proposed Knowledge Base, any supporting VEA is able to contribute its specific knowledge to the conceptual representation (for example, CAD provides the geometry, CAM provides manufacturing tool paths and FEA provides failure analysis). In order to implement the supporting Knowledge Base, we propose the architecture depicted in Fig. 3.
Fig. 2. Representation of the data flow through the PLC. (a) presents the classical approach, and (b) shows the data flow using a supporting Knowledge Base.
Fig. 3. Proposed architecture for the implementation of the supporting KB. Data from the Software layer is translated in order to feed the Knowledge Base.
In our architecture, the first layer is called the Software layer; it contains the collection of VEA available in the different PLC stages. These VEA interact with users in the classical way (through their own interfaces). In general, the VEA in this layer possess extension capabilities that allow access to their embedded knowledge. The next layer is called the Translation layer; in this part of the architecture, the alignment and matching between the knowledge generated by the VEA and the Domain ontology located in the next layer takes place. It can be argued that these translators provide the means of matching the VEA knowledge with the supporting knowledge structure; for this reason, the choice of a good model for the Domain is critical, as problems like information incompleteness and multiple sources leading to redundancy must be considered. Each of the VEA in the Software layer is associated with its own translation module. Translators allow bidirectional traffic of the knowledge: they are able to convert the knowledge generated in the VEA into the Domain ontology and, vice versa, from the ontology into the VEA format. Translators can be constructed in several ways, the most usual being the use of an API or the parsing of the generated output files. The following layer is called the Knowledge Base layer; this part of the architecture contains the Domain ontology itself. All the gathered knowledge generated by each VEA is stored and managed here. This layer also contains a Reasoner for the exploitation of the extrinsic knowledge [16] contained, through a reasoning API [17], which is commonly a Description Logics (DL) handler [18] that processes the Domain ontology. At the top of the architecture is the Application layer, where a series of applications capable of interacting with the KB directly can be produced. Such applications take advantage of all the stored knowledge while interacting with users through their own interfaces.
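As an illustration of how a translator in the Translation layer might feed the Domain ontology, the following sketch uses the Python rdflib library with a purely hypothetical plant namespace, class and property names; the actual system uses an ISO 10303 (STEP)-based Domain ontology, as described in the case study.

```python
from rdflib import Graph, Literal, Namespace, RDF
from rdflib.namespace import XSD

PLANT = Namespace("http://example.org/plant#")   # hypothetical namespace

def translate_cad_element(graph, element_id, x, y, z):
    """Map one CAD element (e.g. parsed via the VEA's API) into ontology individuals."""
    individual = PLANT[element_id]
    graph.add((individual, RDF.type, PLANT.StaticElement))
    graph.add((individual, PLANT.hasPositionX, Literal(x, datatype=XSD.double)))
    graph.add((individual, PLANT.hasPositionY, Literal(y, datatype=XSD.double)))
    graph.add((individual, PLANT.hasPositionZ, Literal(z, datatype=XSD.double)))

g = Graph()
g.bind("plant", PLANT)
translate_cad_element(g, "Wall_01", 0.0, 12.5, 0.0)
print(g.serialize(format="turtle"))
```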
4 Case Study
As a case study, we developed, within the frame of a research project, an application for the design of industrial plant layouts. In such a scenario, many teams of engineers participate in the design of the plant with different VEA and in different stages of the PLC, producing a situation that is arguably prone to semantic loss. In our scenario, the goal is the consideration of a new product that will be manufactured in the aforementioned facility. The industrial plant has to be adapted in one or several of the production lines already in existence in order to manufacture the new product. Our visualization tool makes use of the presented architecture in order to provide an effective bridge over the gap of knowledge loss between the different VEA utilized. Fig. 4 depicts the relation of our case study to the generalized architecture presented in Section 3. In order to simplify the example, we consider only three VEA in the Software layer: AutoCAD, used for the design of static geometries (located mainly in the design stage of the PLC); RobCAD, used for kinematics calculation and moving-object simulation (located in the analysis and operation stages of the PLC); and an XML-serialized file which contains diverse database-gathered information in the form of stored facts about the actual plant (located in the operation and maintenance stages of the PLC). We differentiate between static elements (e.g. walls, floor, fixed objects) and non-static elements (e.g. a robotic arm) in order to easily recognize the knowledge that must be involved in the re-calculation duties during the simulation of the plant layout operation. Each VEA used possesses a translator in order to match the generated knowledge to the Domain ontology structure (Translation layer). AutoCAD provides an API called ObjectARX [19], which programmatically allows the parsing of the different 3D models contained. In the case of RobCAD, the vendor does not provide any public API, but instead a plug-in that allows exporting the generated animated models into the VRML format. Even though such files have some non-standardized labels, we developed a VRML parser in order to extract the knowledge contained in those files and match it into the ontology (for the extraction of the movements themselves). In a similar way, we developed an XML parser to recover the information stored in the XML generator output files, matching this information into the ontology. In [20] the advantages of using engineering standards as a basis of ontology Domain modelling are detailed. As one of these advantages is the avoidance of semantic loss, the use of engineering standards is not only validated for our solution but highly desirable. Following the procedure for creating Domain ontologies based on standards presented in [20], we found reported success cases in the domain of Plant Design with the standard ISO-STEP (10303 AP 227) [21]. Our Domain Model in the Knowledge Base layer was hence serialized using the aforementioned approach, adding also a second ontology for the extension of the STEP protocol to the particularities of the problem at hand (again following the methodology presented in [20]). For the reasoning part, we created a simple interface that uses Pellet [22] via the Protégé OWL API [23].
Fig. 4. Case study matched in the proposed architecture. Three different VEA are located in the software layer, each one with its own translator for KB feeding.
Fig. 5. Industrial Plant Layout Design application screenshot. The interface presents a 3D representation of the layout.
At this point, we have developed our desired Industrial Plant Layout Design application in the Application layer. The developed application allows the user to examine different layouts for the same industrial plant in a 3D environment by navigating through the plant and modifying the layout in order to obtain the best possible configuration, with the aid of the semantic engine, which does not permit configurations that contradict design principles (using the reasoner capabilities for that purpose). The user can also obtain additional information about the elements contained in the active layout, for example the security area needed for the KUKA KR 150-2 robot or the costs associated with a breakdown in the painting process manufacturing cell. As can be seen in Fig. 5, the application interface shows the three-dimensional representation of the active layout and the different layouts (with their corresponding elements) in a tree view. When the mouse is over an element in the tree, the system asks the knowledge base for the position, translation and scale of the concrete instance of the 3D model and presents the information to the user. If the user needs other information about any specific element or layout, he/she can launch a query to the knowledge base from the query menu. Such a query is handled by the reasoner over the instances of the ontology.
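A sketch of the kind of query launched from the query menu is shown below, again using rdflib and the hypothetical plant namespace from the previous listing; in the real system such queries are answered through the Pellet reasoner via the Protégé OWL API.

```python
from rdflib import Graph, Literal, Namespace, RDF
from rdflib.namespace import XSD

PLANT = Namespace("http://example.org/plant#")   # same hypothetical namespace as above
g = Graph()
g.add((PLANT.Wall_01, RDF.type, PLANT.StaticElement))
g.add((PLANT.Wall_01, PLANT.hasPositionX, Literal(0.0, datatype=XSD.double)))

query = """
PREFIX plant: <http://example.org/plant#>
SELECT ?element ?x
WHERE {
    ?element a plant:StaticElement ;
             plant:hasPositionX ?x .
}
"""
for row in g.query(query):
    print(row.element, row.x)
```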
5 Conclusions and Future Work
In this paper, we presented a semantics-based architecture that allows seamless information and knowledge sharing between Virtual Engineering Applications in a Product Life Cycle scenario. We discussed the main highlights of the VEA interoperability problem and validated the presented approach with a test case consisting of the implementation of a Plant Layout Design application supported by our architecture. Our implementation is effective in collecting the knowledge provided by the different VEA available, showing that our approach could be applicable within industry. At this time, the presented system is being tested in order to offer a deep statistical analysis in subsequent work. As future work, we intend to extend our implementation to more VEA in order to effectively assess the economic impact that having a centralized knowledge structure provides. We are also currently working on the use of the gathered centralized knowledge within a Semantic Web application running as a service on a company web page, in order to provide means for remote monitoring of an industrial plant.

Acknowledgments. We would like to express our gratitude to the Basque Government for the partial funding of the project ePlantVerify, where the main techniques presented in this article were developed (under the INNOTEK 2007-2010 grants call), and for the partial funding of PCT/ES 2010/000009, where our system is in the process of a research patent concession. We also want to thank Inge-Innova and Inertek 3D as partners of the consortium of the aforementioned research initiatives.
References 1. Jian, C.Q., McCorkle, D., Lorra, M.A., Bryden, K.M.: Applications of Virtual Engineering in Combustion Equipment Development and Engineering. In: ASME International Mechanical Engineering Congress and Expo., IMECE 2006–14362, ASME. ASME Publishers (2006) 2. Zorriassatine, F., Wykes, C., Parkin, R., Gindy, N.: A survey or virtual prototyping techniques for mechanical product development. Journal of Engineering Manufacture 217, 513–530 (2003) 3. Toro, C.: Semantic Enhancement of Virtual Engineering Applications. PhD Thesis, University of the Basque Country, Spain (2009) 4. Obrst, L.: Ontologies for Semantically Interoperable Systems. In: CIKM 2003: Proceedings of the Twelfth International Conference on Information and Knowledge Management, pp. 366–399. ACM, New York (2003) 5. Gruber, T.R.: Toward principles for the design of ontologies used for knowledge sharing. International Journal of Human-Computer Studies 43(5-6), 907–928 (1995) 6. Mencke, S., Vornholt, S., Dumke, R.: The Role of Ontologies in Virtual Engineering. In: 19th International Conference on Database and Expert Systems Application, pp. 95–98. IEEE Computer Society, Los Alamitos (2008) 7. ObjectARX, http://usa.autodesk.com/adsk/servlet/index?siteID= 123112&id=773204 (last visited March 18, 2010) 8. Solidworks 2003: API Fundamentals. Solidworks Corporation (2001) 9. Initial Graphics Exchange Specification (IGES), http://tx.nist.gov/standards/iges (last visited March 18, 2010) 10. Electronic Data Interchange (EDI). Federal Information Processing Standards Publication 161-2, http://www.itl.nist.gov/fipspubs/fip161-2.htm (last visited March 18, 2010) 11. Busse, S.: Interoperable E-Learning Ontologies Using Model Correspondences. In: Meersman, R., Tari, Z., Herrero, P. (eds.) OTM-WS 2005. LNCS, vol. 3762, pp. 1179– 1189. Springer, Heidelberg (2005) 12. Obrst, L., Liu, H., Wray, R., Wilson, L.: Ontologies for Semantically Interoperable Electronic Commerce. In: International Conference on Enterprise Modelling and Enterprise Integration Technologies (ICEIMT), and the Conference of the EI3-IC Initiative (Enterprise Inter- and Intra-Organisational Integration – International Consensus), Valencia, Spain, April 24-26, pp. 366–399. Kluwer Academic Publishers, Dordrecht (2002) 13. Fonseca, F., Câmara, G., Monteiro, A.M.: A Framework for Measuring the Interoperability of Geo-Ontologies. Spatial Cognition and Computation 6(4), 309–331 (2006) 14. Berners-Lee, T., Hendler, J., Lassila, O.: The Semantic Web. Scientific American 284(5), 34–44 (2001) 15. Fikes, R.: Multi-Use Ontologies, Stanford University, http://www-ksl.stanford. edu/people.fikes/cs222/1998/Ontologies/sld001.htm (last visited March 18, 2010) 16. Sattler, U.: Description Logic Reasoners, http://www.cs.man.ac.uk/~sattler/reasoners.html 17. Using the Protégé-OWL Reasoner API, http://protege.stanford.edu/ plugins/owl/api/ReasonerAPIExamples.html 18. Baader, F., Horrocks, I., Sattler, U.: Description Logics. In: Handbook of Knowledge Representation. Elsevier, Amsterdam (2007) 19. Kramer, B.: ObjectARX Primer. Autodesk Press (1999)
20. Toro, C., Graña, M., Posada, J., Vaquero, J., Sanin, C., Szczerbicki, E.: Domain Modeling Using engineering Standards. In: Velásquez, J.D., Ríos, S.A., Howlett, R.J., Jain, L.C. (eds.) KES 2009. LNCS, vol. 5711, pp. 95–102. Springer, Heidelberg (2009) 21. SCRA: STEP Application Handbook ISO 10303, Version 3 (2006), http://www.uspro.org/documents/STEP_application_hdbk_63006_B F.pdf (last visited March 18, 2010) 22. Sirin, E., Parsia, B., Cuenca Grau, B., Kalyanpur, A., Katz, Y.: Pellet: A practical OWLDL reasoner. Journal of Web Semantics: Science, Services and Agents on the World Wide Web 5(2), 15–53 (2005) 23. Protégé OWL API, http://protege.stanford.edu/plugins/owl/api/ (last visited March 18, 2010)
Discovering and Usage of Customer Knowledge in QoS Mechanism for B2C Web Server Systems Leszek Borzemski1 and Grażyna Suchacka2 1
Institute of Informatics, Wrocław University of Technology Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Poland [email protected] 2 Chair of Computer Science, Opole University of Technology Sosnkowskiego 31, 45-272 Opole, Poland [email protected]
Abstract. The paper deals with the problem of guaranteeing high Quality of Service (QoS) in e-commerce Web servers. We focus on the problem of request admission control and scheduling in a Business-to-Consumer (B2C) Web server from the profit perspective of the owner of an e-business company. We propose extending a Web server system with the ability to identify and favour key customers of a Web store and to ensure the possibility of successful interaction for all customers finalizing their purchase transactions. We propose applying Recency-Frequency-Monetary (RFM) analysis to discover key customer knowledge and using the resulting RFM scores in a novel QoS mechanism. We discuss the mechanism and some simulation results of its performance.

Keywords: QoS, Web server, e-commerce, B2C, key customer, RFM analysis.
1 Introduction
The increasing popularity of Web-based services has been observed in recent years. The Web has proved to be an especially promising platform for e-commerce, allowing individuals to browse and buy goods over the Internet. Online Web stores, organized as B2C Web sites, pose an attractive alternative to conventional stores. A key factor for the success of each Web store is the efficiency of the Web server system hosting the B2C Web site. The Web server is responsible for receiving HTTP requests from Web clients (i.e. Web browsers), generating HTTP responses and sending them back to the clients. The problem is that Web servers are not always able to provide efficient and scalable services in the face of unpredictable load variations. FIFO scheduling, which is commonly applied in the servers, does not guarantee either request completion or reasonable response times. Furthermore, all requests are treated in the same way and are equally harmed during server overload. In the case of B2C Web servers, such situations are particularly disadvantageous because incoming HTTP requests have different "business" values, depending on the type of business operation being performed by the user in the Web store. Example operations include entry to the site, browsing goods, adding them to a shopping cart,
etc. A sequence of business operations performed by the user during a single visit to the site makes up a user session. Finalizing a purchase transaction is usually connected with the final operations in a session, and thus the successful completion of all previous requests in the session is a condition for product purchase. Improving the quality of service in a B2C Web server system can be achieved by introducing an efficient admission control and/or scheduling algorithm taking into account various session attributes, which may affect the resulting revenue. The rest of this paper is organized as follows. Section 2 discusses related work. Section 3 defines QoS from a business perspective. Section 4 presents the motivation for a novel approach to B2C service, which is described in section 5. In section 6 we discuss some simulation results for the approach, and we conclude in section 7.
2 Related Work A lot of work on QoS for Web servers has been done so far, but relatively few studies have targeted B2C systems. Some of them have focused on B2C workload analysis and identifying customer shopping patterns [1, 2, 3], as well as on personalization and recommendation methods [4, 5]. However, very limited work has been reported on admission control and scheduling algorithms for B2C Web servers that take the knowledge-based aspect of e-business profitability into consideration. Admission control aims to prevent system overload by rejecting some incoming HTTP requests. In the case of B2C applications, a decision on request rejection under system overload is usually only taken for requests opening new sessions, while requests belonging to sessions in progress are accepted [1, 6, 7]. The admission control algorithm proposed in [7] additionally uses information on whether a sender's IP address has previously been associated with a purchase in a given Web store. Scheduling concerns reordering the queue of requests at the server input or the queues at server resources such as the CPU, disk or network interface. To this end, requests usually receive priorities based on various characteristics of a user session, such as the session state, corresponding to the kind of business operation performed at the site [1, 2, 6, 8, 9, 10], probabilities of transitions between the session states [1, 9], as well as the financial value of goods in a shopping cart and the session length [2]. The literature analysis has shown that there is a need for preferential treatment of the most valued customers of Web stores. However, to the best of our knowledge, none of the admission control and scheduling algorithms for a B2C Web server have distinguished the most valued customers, nor have they applied priority scheduling to them. We address this problem in combination with the problem of ensuring the highest revenue, achieved as a result of the successful completion of buying sessions in a B2C Web server system.
3 Quality of Service from Business Perspective QoS is connected with the classification of network traffic in order to differentiate between different traffic classes and to provide different service levels to different classes. End-to-end QoS involves introducing mechanisms both in networks carrying
data between end hosts, and in the end hosts themselves. In this paper we focus on the server side of the Web service, motivated by the literature results [11, 12] showing that the total quality of end-to-end Web service is strongly dominated by Web server delays. We consider QoS-enabled Web server systems for B2C applications. Traditionally, Web server performance has been evaluated from the perspective of a computer system, using metrics connected with request latencies in the system or the system throughput. However, in the case of B2C Web servers, evaluation of the system performance from the profit perspective is more appropriate. Business-oriented performance metrics, related to money earned and lost as a result of request processing in the system, are more relevant from the online retailers' point of view. We aim to improve the performance of the Web server system in terms of two business-oriented metrics: 1) revenue throughput, defined as the amount of money per minute generated through successfully completed buying sessions [2]; 2) the percentage of successfully completed sessions of the most valued customers in the observation window.
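As a simple illustration of how these two metrics could be computed from a log of finished sessions, a minimal sketch is given below; the record fields (revenue, is_key_customer, completed) and the helper names are assumptions made here for illustration, not part of the described system.

```python
# Minimal sketch of the two business-oriented metrics; field names are illustrative assumptions.
def revenue_throughput(sessions, window_minutes):
    """Money per minute earned through successfully completed buying sessions."""
    earned = sum(s["revenue"] for s in sessions if s["completed"])
    return earned / window_minutes

def completed_kc_percentage(sessions):
    """Percentage of key-customer (KC) sessions completed in the observation window."""
    kc = [s for s in sessions if s["is_key_customer"]]
    if not kc:
        return 0.0
    return 100.0 * sum(1 for s in kc if s["completed"]) / len(kc)
```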
4 Motivation for a Novel Approach to B2C Service We propose extending a B2C Web server system with two functions. The first is an ability to identify and evaluate the most valued customers of the online store, and to offer the best quality of service to them in order to boost their loyalty and to increase the online retailer's revenue in the long run. The second function is connected with ensuring successful interaction for all customers who are going to make a purchase. Preferential treatment of the most profitable and loyal customers has been practised in traditional marketing and is a component of many CRM (Customer Relationship Management) techniques. The experience of many companies confirms the Pareto principle, which states that the dominant part of company profit is generated by a relatively small percentage of the most profitable customers [13]. It has also been shown that winning over new customers is up to 5–6 times more expensive than retaining the loyalty of those already won over. Furthermore, regular customers provide a bigger profit and are more tolerant of price rises. In the B2C environment the preferential treatment of key customers seems to be especially justified. Most e-customers are returning customers, who have a much bigger probability of making a purchase than users without prior purchases [14]. Online shopping activity is highly concentrated, with 5% of online buyers accounting for 40% of all transactions [15]. Heavy buyers tend to be more loyal and are very likely to confine their buying to their favourite stores. On the other hand, in an online store, customers who really want to buy something should be served quickly, as in traditional retail trade. The percentage of e-commerce customers who end up buying something has been found to be merely a few percent, while the prevailing majority of customers only browse the site [14]. Moreover, a significant fraction of sessions at B2C sites is generated by robots [16]. At peak times the server load due to browsing requests far exceeds that of revenue-generating requests, and it prevents users at the most advanced session stages from successfully completing their sessions and making a purchase.
It is also worth favouring customers with items in their shopping carts. However, we argue that they should receive lower priority than customers who are just about to make a purchase because of the common phenomenon of shopping cart abandonment.
5 Novel Approach to B2C Service Using Customer Knowledge We propose a QoS mechanism in order to achieve two business-oriented goals. The main goal is ensuring the highest possible revenue as a result of successful completion of buying sessions. The second goal is offering the shortest page response times for the most valued customers as far as possible (and thereby ensuring a large percentage of successfully completed sessions of key customers). Our method is named KARO (Key customers And Revenue Oriented scheduling) and the algorithm KARO-Rev (KARO – Revenue Maximization). In this section we describe a way of e-customer knowledge discovery using RFM analysis. Then, we present a way of using this knowledge to request service control in the Web server system. 5.1 Discovering of E-Customer Knowledge Using RFM Analysis The idea of identifying the most loyal and profitable customers of a company and retaining long-term relationships with them is the basis of many CRM techniques. We propose applying this idea to the e-commerce environment and implementing it in a request service control mechanism for a B2C Web server system. Motivated by the literature results presented in section 4, we propose distinguishing the most valued customers of the online store depending on their past purchasing behaviour at the B2C Web site. To quantify customer values we propose applying RFM analysis [17]. It is a segmentation method which allows clustering customers based on their past behavioural data and using that information to predict future customers’ behaviour. The behavioural data encompass three aspects of a customer’s buying behavior in the Web store. • Recency means the time interval from the customer’s last purchase until now. • Frequency means the total number of times the customer has made a purchase. • Monetary means the total amount of money the customer has spent in the store. After performing a customer database segmentation according to RFM analysis, each key customer is described with recency, frequency and monetary codes. From a simple or weighted combination of these codes, a single RFM score for a key customer is derived. RFM analysis has been used in traditional retail trade to develop and put into practice efficient customer-specific marketing strategies [18, 19]. In contrast to many other CRM techniques based on CLV (Customer Lifetime Value), it does not require collecting demographic, social or economic information on the company’s clients and does not involve big costs. Strong arguments for applying this method into the e-commerce environment are its simplicity and the lack of a necessity to perform complex computations in real time. All the data required to compute RFM codes are easily accessible and can be captured by a simple observation of customers’ purchasing behaviour.
We propose creating a customer database and storing it on the Web server. For each customer who has made at least one purchase in the Web store, there is an appropriate record in the database. Each customer record contains the date of the last purchase, a counter for the total number of times the customer has made a purchase and a counter for the total monetary amount spent by the customer. For each customer these three pieces of data are used to construct RFM codes in the following way. A recency code is based on the calendar and divides customers based on their recency values according to the ranges: 0–3 months, 4–6 months, 7–12 months, 13–24 months and above 24 months. Customers in the most recent group are assigned the number 5, which is inserted in their customer records. The next group receives the number 4, etc. After that step every customer in the database has a recency code of 5, 4, 3, 2 or 1. A frequency code is constructed using the Behavior Quintile Scoring method. All customer records are sorted by frequency, with the greatest frequencies at the top. Then the records are divided into five groups on the basis of five ranges of the frequency values, with four cutoffs generated every 20% of the highest frequency value. Thus the highest group includes customers whose frequency exceeds 80% of the highest frequency value – these customers receive a frequency code of 5. On the other hand, the lowest group includes customers whose frequency does not exceed 20% of the highest frequency value – they receive a code of 1. In this way each of the five groups covers a similar range of frequency values, although the number of customers in each group may be different. After this step every customer record has a two-digit code which varies from 55 (the most recent and most frequent customers) down to 11 (the oldest and least frequent customers). A monetary code is constructed using the same method as for the frequency code, but by sorting and dividing the customer records by monetary values. After this step every customer record has a three-digit code varying from 555 down to 111. All in all, there are 125 possible RFM cells. A single RFM score of a customer s is calculated according to the formula:
RFMs = wR × Rs + wF × Fs + wM × Ms ,    (1)
where wR, wF, wM are weights assigned to the corresponding behavioral variables, and Rs, Fs, Ms are codes of the corresponding behavioral variables for the customer s, i.e. recency, frequency and the monetary value, respectively. For example, if the weights wR, wF, wM are all equal to 3, then the minimum possible value of a single RFM score will amount to 9 and the maximum possible value will amount to 45. For a given Web store such segmentation should be performed periodically, e.g. every month, and RFM codes and the resulting RFM scores should be recalculated for all customer records in the database. Moreover, after each next purchase made by a customer, the corresponding values should be updated (for the existing record) or a new record should be created (for a new buyer). 5.2 Request Service Control in the B2C Web Server System We consider a B2C Web server system organized as a multi-tiered application, consisting of a single Web server and a single back-end server. A general structure of the system is presented in Fig. 1.
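Before turning to the details of Fig. 1, the RFM coding and scoring described in Sect. 5.1 can be summarized in the following minimal sketch; the record fields, helper names and the (3, 3, 3) weights are illustrative assumptions, not the authors' implementation.

```python
RECENCY_BINS = [(3, 5), (6, 4), (12, 3), (24, 2)]   # (months since last purchase, code); older -> 1

def recency_code(months_since_last_purchase):
    """Calendar-based recency code: 0-3 months -> 5, 4-6 -> 4, 7-12 -> 3, 13-24 -> 2, else 1."""
    for months, code in RECENCY_BINS:
        if months_since_last_purchase <= months:
            return code
    return 1

def quintile_code(value, max_value):
    """Behavior Quintile Scoring: cutoffs every 20% of the highest observed value."""
    if max_value <= 0 or value <= 0:
        return 1
    ratio = value / max_value
    for code, cutoff in ((5, 0.8), (4, 0.6), (3, 0.4), (2, 0.2)):
        if ratio > cutoff:
            return code
    return 1

def rfm_score(customer, customers, weights=(3, 3, 3)):
    """Single RFM score according to formula (1); with weights (3, 3, 3) it ranges from 9 to 45."""
    max_f = max(c["frequency"] for c in customers)
    max_m = max(c["monetary"] for c in customers)
    r = recency_code(customer["months_since_last_purchase"])
    f = quintile_code(customer["frequency"], max_f)
    m = quintile_code(customer["monetary"], max_m)
    w_r, w_f, w_m = weights
    return w_r * r + w_f * f + w_m * m
```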
Fig. 1. The proposed model of request service in a B2C Web server system
We distinguish two types of requests: 1) HTTP requests come to the system from the outside. Each Web page involves processing many HTTP requests, i.e. the first hit for an HTML page and following hits for all objects embedded in it. Embedded objects may be static objects served by a Web server (e.g. images) or dynamic objects created online. Accepted HTTP requests are put into a queue Q1 at the Web server input. If a request's waiting time exceeds T1, the request will be dropped from the queue. 2) Dynamic requests are generated by the Web server and sent to the back-end server if a response to an HTTP request has to be created online, i.e. it requires online computations and a database access. Dynamic requests are put into a queue Q2 at the back-end server input. The maximum request waiting time, after which a request will be dropped from the queue, is specified by T2. Let n be the moment of a new HTTP request arrival to the Web server or a moment of a new dynamic request arrival to the back-end server. Request service control in the Web server system consists of taking decisions at successive moments n concerning HTTP request admittance or rejection, scheduling of admitted HTTP requests waiting in the queue Q1, and scheduling of dynamic requests waiting in the queue Q2. The control decisions are based on business-oriented criteria. At any moment n a number of users are interacting with a B2C Web site, i.e. there are S(n) user sessions active and served by the system. Every session begins with entry to the home page of the site and then the user continues the visit performing various business operations. Each session s = 1, 2, ..., S(n) is characterized by the following attributes: • The session class cs(n)∈{KC, OC}. We define two session classes: a KC class for key customers and an OC class for ordinary ones. We define a key customer as a user who had made a purchase in a Web store sometime; an ordinary customer is a user without prior purchases. We assume that every user starts a session as an ordinary customer and may be identified as a key one at any moment of the session after logging on. • The session rank RFMs(n) reflects the value of the customer. For a KC class it corresponds to the customers’ RFM scores, computed based on the customers’ purchase histories using RFM analysis. For the OC class it is assumed to be zero.
• The session state es(n)∈{H, L, B, S, D, A, R, P} corresponds to a type of business operation at the B2C site: entry to the Home page (H), Logging on (L), Browsing (B), Searching for products (S), product’s Details (D), Adding a product to a shopping cart (A), Registration (R) or making a Payment (P). • The session length ls(n) = 1, 2, ... means the number of Web pages visited in the session so far, including the current page. • The value of a shopping cart vs(n) is computed as an amount of money (in $) corresponding to prices of products in the shopping cart. • The state of a shopping cart ηs(n)∈{0, 1} means an empty or not empty cart. In order to support differentiated QoS in the Web server system, we propose introducing dynamic session priorities and applying priority-based request admission control and scheduling. A priority-based approach to the request service in a Web server has been commonly used in QoS research. Our dynamic session priority scheme is based on the method proposed in [2]. Let Ps(n) be the priority of session s at the n-th moment. The priority changes with the session progress depending on its attributes. We define two thresholds, TMed and TLow, which determine three ranges for session length. The priority of session s at the n-th moment is calculated according to the following formula:
Ps(n) =
   Highest  for (cs(n) = KC) ∨ [(cs(n) = OC) ∧ (ηs(n) = 1) ∧ (es(n) = P)],
   High     for [(cs(n) = OC) ∧ (ηs(n) = 1) ∧ (es(n) ≠ P)] ∨ [(cs(n) = OC) ∧ (ηs(n) = 0) ∧ (ls(n) < TMed)],
   Medium   for (cs(n) = OC) ∧ (ηs(n) = 0) ∧ (TMed ≤ ls(n) < TLow),
   Low      for (cs(n) = OC) ∧ (ηs(n) = 0) ∧ (ls(n) ≥ TLow) .    (2)
The priority Highest is reserved for all key customers, as well as for ordinary customers finalizing their purchase transactions. The priority High is assigned to all customers entering the site, to give a chance of successful interaction to all users, and to allow key customers to log into the site, as well as to ordinary customers with some items in their shopping carts. Priorities Medium and Low are assigned to ordinary customers who have stayed at the site for a long time with empty carts. A session priority is verified and updated at arrivals of HTTP requests for new Web pages only. Thus, all subsequent HTTP requests for objects embedded in a page, as well as all corresponding dynamic requests, have the same priority. In order to enforce admission control decisions, we define two thresholds for the system load intensity: I1 and I2, determined by the queue Q2 length L2(n). A condition of acceptance of an HTTP request at the n-th moment is given by the formula:
ζ(Ps(n), L2(n)) =
   0  if the request is for an HTML object and [(I1 ≤ L2(n) < I2) ∧ (Ps(n) = Low)] ∨ [(L2(n) ≥ I2) ∧ ((Ps(n) = Low) ∨ (Ps(n) = Medium))],
   1  in other cases.    (3)
When L2(n) exceeds the threshold I1, the system starts to reject HTTP requests for new Web pages in Low priority sessions. Above the threshold I2 HTTP requests for new Web pages both in Low and Medium priority sessions are rejected. All requests
for embedded objects are accepted regardless of the session priority, as are all requests for new pages in High and Highest priority sessions. We apply priority scheduling in the queues Q1 and Q2, which enables changing the order of request execution at the Web and back-end server, respectively. For both queues Q1 and Q2, strict priority scheduling is applied between the four priorities, which means that all higher priority requests are scheduled before lower priority ones. Highest priority requests are ordered by decreasing shopping cart value, and requests from sessions with the same shopping cart value are additionally ordered by decreasing session rank. High priority requests are ordered by decreasing shopping cart value, while both Medium and Low priority requests are queued in FIFO order.
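A sketch of how formulas (2) and (3), together with the queue ordering described above, could be coded is given below; the session dictionary keys are assumptions made for illustration, and the threshold values follow the simulation settings reported in Sect. 6.

```python
T_MED, T_LOW = 2, 20   # session-length thresholds (values used in the simulations)
I1, I2 = 30, 80        # load thresholds on the back-end queue length L2(n)

def session_priority(s):
    """Priority assignment of formula (2); s is a dict with illustrative keys."""
    oc, cart, state, length = s["cls"] == "OC", s["cart_not_empty"], s["state"], s["length"]
    if s["cls"] == "KC" or (oc and cart and state == "P"):
        return "Highest"
    if oc and ((cart and state != "P") or (not cart and length < T_MED)):
        return "High"
    if oc and not cart and T_MED <= length < T_LOW:
        return "Medium"
    return "Low"

def accept_request(priority, l2, is_html_page_request):
    """Acceptance condition of formula (3): only page requests of low-value sessions are rejected."""
    if not is_html_page_request:
        return True
    if I1 <= l2 < I2 and priority == "Low":
        return False
    if l2 >= I2 and priority in ("Low", "Medium"):
        return False
    return True

def queue_key(s, priority):
    """Within-priority order: Highest by cart value then session rank, High by cart value, rest FIFO."""
    if priority == "Highest":
        return (-s["cart_value"], -s["rfm_score"])
    if priority == "High":
        return (-s["cart_value"],)
    return (s["arrival_time"],)
```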
6 Simulation Results
To verify the efficiency of the proposed approach, we developed a simulation tool consisting of a session-based workload generator and a B2C Web server system simulator. Our simulation model, and the initial version of the tool, have been described in [20]. Using the tool we have performed a performance study of a Web server system with and without the algorithm KARO-Rev. In all simulation experiments, the following parameter values have been applied: T1 = T2 = 8 seconds, TMed = 2, TLow = 20, I1 = 30, I2 = 80. A user page latency limit, after which a user gives up the interaction, was equal to 8 seconds. The share of KC sessions in the generated workload, PKC, was 10% or 40%. Each experiment was run for a constant session arrival rate and the system performance was monitored in a 3-hour observation window following a 10-hour preliminary phase of the simulated system operation. Fig. 2 presents the variation of the two main business-oriented performance metrics as a function of the session arrival rate. As can be seen in Fig. 2 (left), for KARO-Rev the revenue throughput increases with the system load throughout the whole load range, as opposed to FIFO. Fig. 2 (right) shows that KARO-Rev has also been able to ensure a very high percentage of successfully completed KC sessions. For PKC = 10% this percentage has varied from 99.93 to 100. Even for the extremely challenging case, i.e. for PKC = 40%, this value has been above 92 for the maximum load level, which is a very satisfactory result compared to the corresponding FIFO value of 25.8.
Fig. 2. Revenue throughput (left) and a percentage of completed key customer sessions (right)
Fig. 3 shows the 90-percentile of KC page response time for different session ranks (i.e. for different key customers' RFM scores), achieved for PKC = 10% (left) and PKC = 40% (right), for the session arrival rates of 100 and 300 sessions/min. As can be seen, the system has offered higher page response times to sessions with lower ranks. The higher the system load and the bigger the share of KC sessions at the site, the bigger the QoS differentiation between KC sessions with different ranks in terms of the 90-percentile of page response time. The results have confirmed the ability of the algorithm KARO-Rev to achieve business-oriented control goals. The algorithm has been able both to increase the revenue significantly and to guarantee high QoS for key customers of the Web store, with respect to their RFM scores.
Fig. 3. 90-percentile of KC page response time for different session ranks in the case of 10% of KC sessions (left) and 40% of KC sessions (right)
7 Conclusions In this paper we have proposed a novel approach to the request service control in a B2C Web server system in order to achieve two business-oriented goals: increasing the achieved revenue and ensuring higher QoS for more valued customers of the Web store. We have proposed applying RFM analysis to discover e-customer knowledge and using it in an admission control and scheduling algorithm for a B2C Web server system. The proposed way of acquiring the customer knowledge does not involve a large cost and is transparent to Internet users. Simulation results for various workload conditions have shown that the proposed algorithm KARO-Rev has been very successful in achieving both goals even under an extremely heavy load. Future work will include the investigation of our mechanism for a variety of workload scenarios. Moreover, the mechanism could significantly benefit from augmenting its awareness with knowledge about sessions generated by robots, as opposed to humans. Another avenue of further study is to apply other ways of identifying key customers known in the CRM field. The use of semantic technologies, similar to that proposed in [4], could be useful for enhancing our approach as well.
References 1. Chen, H., Mohapatra, P.: Overload Control in QoS-aware Web Servers. Computer Networks 42(1), 119–133 (2003) 2. Menascé, D.A., Almeida, V.A.F., et al.: Business-Oriented Resource Management Policies for E-Commerce Servers. Perform. Eval. 42(2-3), 223–239 (2000) 3. Vallamsetty, U., Kant, K., Mohapatra, P.: Characterization of E-Commerce Traffic. Electronic Commerce Research 3, 167–192 (2003) 4. Blanco-Fernández, Y., et al.: Personalizing e-Commerce by Semantics-Enhanced Strategies and Time-Aware Recommendations. In: 3rd SMAP 2008, pp. 193–198 (2008) 5. Cao, Y., Li, Y.: An Intelligent Fuzzy-based Recommendation System for Consumer Electronic Products. Expert Syst. Appl. 33(1), 230–240 (2007) 6. Carlström, J., Rom, R.: Application-aware Admission Control and Scheduling in Web Servers. In: Infocom 2002, vol. 2, pp. 506–515. IEEE Press, New York (2002) 7. Yue, C., Wang, H.: Profit-aware Admission Control for Overload Protection in E-commerce Web Sites. In: 15th IEEE IWQoS 2007, pp. 188–193. IEEE Press, Los Alamitos (2007) 8. Singhmar, N., Mathur, V., Apte, V., Manjunath, D.: A Combined LIFO-Priority Scheme for Overload Control of E-commerce Web Servers. In: IISW 2004 (2004) 9. Totok, A., Karamcheti, V.: Improving Performance of Internet Services Through Reward-Driven Request Prioritization. In: 14th IEEE IWQoS 2006, pp. 60–71. IEEE Press, Los Alamitos (2006) 10. Zhou, X., Wei, J., Xu, C.-Z.: Resource Allocation for Session-Based Two-Dimensional Service Differentiation on e-Commerce Servers. IEEE Trans. Parallel Distrib. Syst. 17(8), 838–850 (2006) 11. Cardellini, V., Casalicchio, E., Colajanni, M., Yu, P.S.: The State of the Art in Locally Distributed Web-Server Systems. ACM Computing Surveys 34(2), 263–311 (2002) 12. Harchol-Balter, M., Schroeder, B., Agrawal, M., Bansal, N.: Size-based Scheduling to Improve Web Performance. ACM Trans. on Computer Systems 21(2), 207–233 (2003) 13. Koch, R.: The 80/20 Principle: The Secret of Achieving More with Less. Doubleday, New York (2008) 14. Measure Twice, Cut Once – Metrics For Online Retailers, Buystream, http://www.techexchange.com/thelibrary/online_retail_metrics.html 15. Pecaut, D.K., Silverstein, M., Stanger, P.: Winning the Online Consumer: Insights into Online Consumer Behavior. Boston Consulting Group Report (2000) 16. Menascé, D.A., Almeida, V.A.F., et al.: A Hierarchical and Multiscale Approach to Analyze E-Business Workloads. Perform. Eval. 54(1), 33–57 (2003) 17. Miglautsch, J.R.: Thoughts on RFM Scoring. The Journal of Database Marketing 8(1), 67–72 (2000) 18. Chen, M.-C., Chiu, A.-L., Chang, H.-H.: Mining Changes in Customer Behavior in Retail Marketing. Expert Syst. Appl. 28(4), 773–781 (2005) 19. Ha, S.H.: Applying Knowledge Engineering Techniques to Customer Analysis in the Service Industry. Advanced Engineering Informatics 21(3), 293–301 (2007) 20. Borzemski, L., Suchacka, G.: Web Traffic Modeling for E-commerce Web Server System. In: CCIS, vol. 39, pp. 151–159. Springer, Heidelberg (2009)
Conceptual Fuzzy Model of the Polish Internet Mortgage Market Aleksander Orłowski1 and Edward Szczerbicki2 1
Gdansk University of Technology, Faculty of Applied Physics and Mathematics PhD Program [email protected] 2 Gdansk University of Technology, Faculty of Management and Economics [email protected]
Abstract. The aim of this paper is to present the conceptual structure of a basic fuzzy model representing the Polish Internet mortgage market. The paper starts with an introduction describing the market complexities and challenges, and a description of the previously created rule-based model. Then the steps of the fuzzification process applied to the proposed model are presented. The final part of the paper consists of conclusions and directions for future research. Keywords: Soft modeling, fuzzy model, internet mortgage market.
1 Introduction The Polish e-banking market, and especially the part of the market that this paper concentrates on, i.e. mortgage loans via the Internet, consists of 4 main stages as shown in Figure 1 [1]. Figure 1 follows the approach of descriptive modelling at the conceptual stage of problem formulation [2] and shows the connections that exist on the mortgage market. Banks are institutions that sell mortgages and offer the option to apply for them on their own web pages. Because the market is very substantial in size and is characterised by strong competition, banks allow their partners to sell their financial products on the partners' pages in order to generate more sales. These companies are typical brokers that receive a commission for every product that they sell. They promote and sell banks' products on their pages but they also create a network called a "partner system" which allows the owners of small web pages to sell the products of the banks on their own pages. The owners of private web pages do not sell enough products to cooperate directly with banks, which is the reason for their association with brokers. For each product sold on a partner web page, its owner receives a commission from the broker, who in turn gets his from the bank. Each level in Figure 1 represents different participants on the market. The Figure should be read in a top-down manner. On the first level there are banks, on the second - brokers, and level 3 consists of partner web pages. Customers make up the lowest level. It was our intention to describe the market from the customers' point of view. The model presumes numerous customers looking for a mortgage using web search engines to find the best offer. Doing so, they reach partner web pages offering
different mortgages. It needs to be pointed out that top search results are bought by the owners of private web pages. A customer chooses one of the web pages (mostly the top one in the web search) and moves to it. This web page provides information about the credit and often adds calculators that enable the customer to find out his/her credit rating. There is also a special link with the application template for a loan in the chosen bank. While visiting the web page a customer fills out the application for the mortgage, which is subsequently sent to the broker with whom the partner has signed a contract. Next the broker sends this application to the bank with which the broker has signed the contract in the first place. The bank begins the credit procedure, contacts the client, checks the credit rating, and makes a decision.
Fig. 1. Formal description of the general mortgage market model [1]
The main strategic challenge for this briefly described market is making accurate predictions about the number of sold mortgages, especially mortgages sold by web partners. Due to the fact that there are several variables of different nature (quantitative and qualitative) influencing the market, traditional statistical methods do not work properly here. Variables in "hard" mathematics take numerical values, while in "soft" fuzzy logic applications non-numeric linguistic variables are often used to facilitate the expression of rules and facts [5],[7]. This is an important modelling aspect in this case, where a number of variables such as "current market feelings" cannot be expressed numerically. As stated in [6], the complete description of a real life system would often require far more detailed data than a human being could ever recognize simultaneously, process, and understand. To address the above challenge, a dedicated rule-based fuzzy model for predicting the number of mortgages sold via web partners is proposed and developed. The proposed model is described in the sections that follow.
2 Rule Based Model In the initial approach to the problem of developing a proper prediction mechanism for the Internet mortgage market, a dedicated rule-based model was created [1]. The model included a total of 120 rules (complete model), 11 constraints, 4 variables and 2 scenarios (each scenario including 120 rules). The model, which was developed and deployed during a period of fast market and economic growth, did not work properly during the financial crisis that came later, so it was necessary to introduce some changes to it. This section concentrates on these changes and focuses on the important aspect of choosing the proper set of model variables to be used in the following step, which is the fuzzification process. First, there is a need to present the initial version of the rules and variables which were used within the model [4]:

Production Rule: IF variable_1 is value_1 AND variable_2 is value_2 AND variable_3 is value_3 AND variable_4 is value_4 AND variable_5 is value_5 AND variable_6 is value_6 THEN result will increase value_7    (1)

Each rule from the model consists of the variables and values presented below:
• Variable_1 = [Commission]
• Variable_2 = [Interest rates]
• Variable_3 = [Advertising]
• Variable_4 = [WIG]
• Variable_5 = [IbnGR]
• Variable_6 = [WNE]
• Result = [Selling mortgage in the Internet]
• Value_1 = [Small, medium, high]
• Value_2 = [Very small, small, medium, high]
• Value_3 = [Very small, small, medium, high, very high]
• Value_4 = [bad, average, good, very good]
• Value_5 = [bad, average, good, very good]
• Value_6 = [very bad, bad, average, good, very good]
• Value_7 = [Very small, small, medium, high, very high]
It was necessary to assess and verify the variables which appear in the model presented above. This process was performed in the following three steps:
1. Selection of the modified set of model variables
2. Verification of the variables
3. Analysis of the correlation between variables
The above process was described in [4] and it resulted in the choice of four variables and the production rule as follows:
• Variable_1 = [Commission]
• Variable_2 = [Interest rates]
• Variable_3 = [Advertising]
• Variable_4 = [WIG]
• Result = [Selling mortgage in the Internet]
• Value_1 = [Small, medium, high]
• Value_2 = [Very small, small, medium, high]
• Value_3 = [Very small, small, medium, high, very high]
• Value_4 = [bad, average, good, very good]
• Value_5 = [Very small, small, medium, high, very high]
Production Rule: IF variable_1 is value_1 AND variable_2 is value_2 AND variable_3 is value_3 AND variable_4 is value_4 THEN result will increase value_5    (2)

Generally, the pre-defined initial group of six variables had an unnecessarily strong representation of very general economic indicators which were highly correlated with each other. The first variable to go was WNE (Market sentiment indicator), as the one with the longest update cycle (it is published only 4 times a year). For the five remaining variables (commission, interest rate, advertising, WIG and IbnGR) the correlation level between them was evaluated. The findings of this process were presented in [4], with two variables (WIG and IbnGR) highly correlated, suggesting that one of them should be deleted from the final list. Because the Warsaw Stock Exchange main indicator level (WIG) is updated more often than IbnGR and has a well-defined numerical representation, the IbnGR variable was deleted from the model.
3 Fuzzy Model 3.1 Fuzzification of the Rule Based Model The key element of the fuzzification process is establishing a proper membership function [8]. Due to the high complexity of the model and the strong influence of socio-economic factors, it was decided to use membership functions structured in a table format. All four variables used in the model were assigned three linguistic values, as illustrated in Table 1 for the variable "commission". Table 2 shows, as further illustration, the membership function for the same variable ("commission"). The linguistic values in Table 1 are based on Polish Internet Mortgage Market reports [11]. Table 1. The example of linguistic values for variable "commission"
Crisp values of the commission level in %    Linguistic values
0,1 – 0,3                                    Small
0,31 – 0,4                                   Medium
0,41 – 1                                     High
Table 2. Tabled membership function for variable "commission"

Crisp value    Grade of membership
               small    medium    high
0,1            1        0         0
0,15           1        0         0
0,2            1        0         0
0,25           0,8      0,2       0
0,3            0,7      0,3       0
0,35           0,5      0,5       0
0,4            0        1         0
0,45           0        0,6       0,4
0,5            0        0         1
0,55           0        0         1
0,6            0        0         1
0,65           0        0         1
0,7            0        0         1
0,75           0        0         1
0,8            0        0         1
0,85           0        0         1
0,9            0        0         1
0,95           0        0         1
1              0        0         1
Table 2 illustrates part of the data used in the model, with 3 fuzzy sets corresponding to the three linguistic values. Grades of membership for crisp values were assigned using a database of experience and information related to Internet mortgage web pages operating on the Polish market. For instance, for the linguistic value 'high' the grade of membership is 0,4 for the value 0,45, and for values 0,5 and higher this membership is equal to 1. This derives from the aforementioned database, where commissions with a value of 0,45% and lower account for 90% of the market. The value 0,45 can be regarded in this business as a border level above which selling mortgages becomes a very profitable activity. The maximum value registered over the last 5 years was a commission of 0,6%. However, according to financial institutions operating in Poland, the highest possible level of commission is 1%, which obviously represents the grade of membership 1 for the linguistic value 'high'.
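A minimal sketch of how such a tabled membership function could be used for fuzzification is given below; the sparse point lists and the step-wise lookup between tabulated points are simplifying assumptions made here, not the authors' implementation.

```python
# Selected points of the tabled membership functions for "commission" (cf. Table 2).
COMMISSION_MF = {
    "small":  {0.10: 1.0, 0.15: 1.0, 0.20: 1.0, 0.25: 0.8, 0.30: 0.7, 0.35: 0.5, 0.40: 0.0},
    "medium": {0.20: 0.0, 0.25: 0.2, 0.30: 0.3, 0.35: 0.5, 0.40: 1.0, 0.45: 0.6, 0.50: 0.0},
    "high":   {0.40: 0.0, 0.45: 0.4, 0.50: 1.0, 1.00: 1.0},
}

def fuzzify(crisp, mf_table):
    """Return {linguistic value: grade} for a crisp input, using the grade of the
    largest tabulated point not exceeding the input (grade 0 below the listed range)."""
    grades = {}
    for label, points in mf_table.items():
        grade = 0.0
        for x in sorted(points):
            if crisp >= x:
                grade = points[x]
        grades[label] = grade
    return grades

print(fuzzify(0.45, COMMISSION_MF))   # {'small': 0.0, 'medium': 0.6, 'high': 0.4}
```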
In the process of fuzzy model development, a rule base consisting of 81 production rules was used. This number follows from the number of variables in the model (four) and the number of values of these variables (three linguistic values for each variable):
3⁴ = 81    (3)
Each rule in the rule base is developed using the IF... THEN logical construct consisting of four variables, as in the following example:

Production Rule: IF Commission is small AND Interest rates is small AND Advertising is small AND WIG is small THEN Selling mortgage in the Internet is small    (4)
After defining the membership function with its fuzzy values for membership numerical degrees, we can proceed with the process of formulating the mapping from a given input to the fuzzy output. This is called fuzzy inference and provides a basis from which decisions can be made or patterns discovered. 3.2 Fuzzy Inference The fuzzy inference process takes a fuzzy set as input and follows the model logic to arrive at the output [9]. To develop and train the inference engine for our case it is necessary to specify values of the output variable for the existing 81 production rules. As it was not possible to automatically generate these output values, expert market knowledge was applied for this purpose. Below we present illustrative examples explaining some parts of this process: Example 1: The lower the value of the variable 'interest rate' is, the higher the value of 'selling mortgage in the Internet' should be. All other variables in the model (money spent on advertising, WIG, commission) work in the other direction: higher values indicate an increase of the output. It should also be noticed that, due to the fact that the model includes all mathematically possible combinations of variables and their linguistic values (the model is complete), it also includes rules which most probably will never appear in real-life applications. An example of such a rule is presented below:

Example 2: IF Commission is small AND Interest rate is medium AND Advertising is high AND WIG is medium THEN Selling mortgage in the Internet is medium    (5)
When the commission for the sold product is small, the interest rate is medium and the stock exchange level (WIG) is medium, it is theoretically possible but highly improbable to invest money in promoting and selling mortgages in these conditions. However
it was decided not to delete rules such as the one shown in the above example and to use the complete model, as in real life people may contradict logic and common sense. Whenever expert knowledge and heuristics are applied, it is necessary to provide justification and verification of such application. As this paper establishes the technicalities and formal stages of the proposed fuzzy approach to mortgage market modeling, the verification aspect is left for further stages of this research. 3.3 Defuzzification Defuzzification, as the process of producing a quantifiable result, is the last step in the fuzzy-logic-based modelling process, and it delivers crisp output values. The Height Method (also known as the max-membership principle) was used in this step [3], [10]. This method has numerous advantages, the most important of which is the fact that the shape of the membership function has no influence on defuzzification and that it does not require complicated calculations. Some examples of the obtained crisp results are presented in Figure 2.
Fig. 2. A screenshot from the modeling process presenting crisp results of the defuzzification process
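A compact sketch of how the rule-base inference and the subsequent Height Method defuzzification could be coded is given below; the two listed rules follow examples (4) and (5), while the representative crisp output levels are assumptions made for illustration, not values from the paper.

```python
RULES = {
    # (commission, interest rate, advertising, WIG) -> output; cf. rules (4) and (5)
    ("small", "small", "small", "small"): "small",
    ("small", "medium", "high", "medium"): "medium",
    # ... the remaining entries of the complete 81-rule base would follow
}

# Assumed representative crisp levels of the output "selling mortgage in the Internet".
OUTPUT_LEVEL = {"small": 0.2, "medium": 0.5, "high": 0.8}

def infer(fuzzy_inputs):
    """fuzzy_inputs: dict variable -> {linguistic value: grade}; min for AND, max to aggregate."""
    activations = {}
    for (c, i, a, w), out in RULES.items():
        strength = min(
            fuzzy_inputs["commission"].get(c, 0.0),
            fuzzy_inputs["interest_rate"].get(i, 0.0),
            fuzzy_inputs["advertising"].get(a, 0.0),
            fuzzy_inputs["wig"].get(w, 0.0),
        )
        activations[out] = max(activations.get(out, 0.0), strength)
    return activations

def defuzzify_height(activations):
    """Height (max-membership) method: take the crisp level of the most activated output set."""
    best = max(activations, key=activations.get)
    return OUTPUT_LEVEL[best]
```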
The main purpose of applying the three steps of the fuzzy modeling process illustrated above (fuzzification, fuzzy inference, defuzzification) was to test the applicability of the purely technical aspects of the process. It was necessary to do this because the application of fuzzy modeling proposed in this paper has never been tested and reported in the literature before. At this stage the exact values (especially of the defuzzification process) are of little interest; we will focus on these issues in the future. Here we were able to show that the proposed concept is technically feasible.
4 Conclusion and Future Research Directions This paper presents the steps of developing a conceptual fuzzy model of the Internet mortgage market, starting from the basic rule-based approach. The proposed model was modified in subsequent steps and finally fuzzified. The process of building the fuzzy model is presented in three steps: fuzzification, fuzzy inference, and defuzzification.
The paper introduces only the first major phase of creating the fuzzy model. This phase was necessary mainly for checking some technical aspects of fuzzifying the problem of Internet mortgage market modeling. In the next step the model will be tuned and enhanced using real market data to make it fully adequate and applicable to real life. To meet the ultimate aims of this research it will be necessary to create a group of three sub-models, each one tailored to the general economic circumstances that can be experienced: recession, moderate growth and a fast growing market. Based on past results it can be concluded that the creation of one comprehensive model does not give proper results when the market changes. Each sub-model will have a specific set of variables and their values, due to the fact that in different market situations different indicators become important. This was confirmed very strongly by the last financial crisis of 2008-2010. Another important future research issue will be the determination of the proper fuzzy model for the current market situation. Neural networks will be considered as a decision support tool to make the above determination. The final research steps will include the process of verification and fine tuning of the proposed models. For verification purposes it is planned to use the 2003-2010 data of one of the biggest Polish Internet brokers, which would cover all three basic market situations: recession, moderate growth and economic boom.
References 1. Orłowski, A.: Knowledge Management in the Internet Mortgage Market. Gdańsk University of Technology, Master Thesis (2008) 2. Szczerbicki, E., Waszczyk, M.: Descriptive modeling of virtual transactions. Cybernetics and Systems: An International Journal 35(5-6), 559–573 (2004) 3. Piegat, A.: Modelowanie i sterowanie rozmyte (Fuzzy Modelling and Control). Akademicka Oficyna Wydawnicza EXIT, Warszawa (1999) 4. Orłowski, A., Szczerbicki, E.: Toward Fuzzy Model of Internet Mortgage Market: Concepts and Initial Assumptions. In: Grzech, A. (ed.) ISAT 2009, Information Systems Architecture and Technology: IT Technologies in Knowledge Oriented Management Process. Oficyna Wydawnicza Politechniki Wrocławskiej, Wrocław (2009) 5. Zadeh, L.A., et al.: Fuzzy Sets, Fuzzy Logic, Fuzzy Systems. World Scientific Press, Singapore (1996) 6. Zimmermann, H.-J.: Fuzzy Set Theory and Its Applications, 4th edn., p. 3. Kluwer Academic Publishers, Dordrecht (2001) 7. Zadeh, L.A.: Fuzzy Sets and Information Granularity; Advances in Fuzzy Systems - Applications and Theory. In: Klir, G.J., Yuan, B. (eds.) Fuzzy Sets, Fuzzy Logic and Fuzzy Systems; Selected Papers, vol. 6, pp. 433–448. World Scientific, Singapore (1996) 8. Zadeh, L.A.: Fuzzy Sets. Information and Control 8, 338–353 (1965) 9. Fuzzy Logic Documentation, http://documents.wolfram.com/applications/fuzzylogic/index.html (19.03.2010) 10. Szczepaniak, P., Lisboa, P.: Fuzzy systems in medicine, pp. 69–71. Physica-Verlag/A Springer-Verlag Company (2000) 11. Bankier.pl, http://panel.systempartnerski.pl/informacje/tabela_prowizji/ (19.03.2010)
Translations of Service Level Agreement in Systems Based on Service Oriented Architecture Adam Grzech and Piotr Rygielski Institute of Computer Science Wroclaw University of Technology Wybrzeze Wyspianskiego 27 50-370 Wroclaw, Poland {adam.grzech,piotr.rygielski}@pwr.wroc.pl
Abstract. The aim of the paper is to introduce and discuss a formal specification of a computer system's Service Level Agreement (SLA) and its translation into the structure of complex services (composed of atomic services) delivering the required functional and non-functional properties in a distributed environment. The SLA is composed of two parts specifying quantitative and qualitative requirements. The former requirements define the structure of the corresponding complex service as a directed graph, where potential parallelism of atomic service performance may be taken into account. The qualitative requirements are applied to select the optimal complex service realization scenario; this is based on the assumption that the various atomic services distinguished in the complex service structure are available in the environment in different versions and locations. Different versions of atomic services deliver the required functionalities and satisfy the non-functional requirements at various levels. Various locations of atomic services mean that the time and cost of atomic service delivery (communication and computation) may vary. The proposed model of SLA translation into complex services allows scenario variants to be used — among others — to calculate upper and lower bounds on complex service delivery times and to estimate the benefit of possible parallelism in complex services. Keywords: SOA, Quality of Service, graph parallelism.
1 Introduction
In systems based on the SOA (Service Oriented Architecture) paradigm, complex services are delivered by a distributed computer communication system as a composition of selected atomic services [3,5]. Each atomic service provides a certain, well-defined functionality and is characterized by parameters describing the quality of the required and delivered service. The functionality of a requested complex service is the sum of the functionalities of ordered atomic services; the order is a consequence of the complex service decomposition into simple tasks ruled by time and data interdependencies. Besides the requested functionality, the service requester may specify certain non-functional properties of the complex service, formulated in terms of security, availability, cost, response time, quality measures, etc. Values
of the non-functional parameters — given in the particular SLA (Service Level Agreement) [4,7] — of the requested complex service may be delivered without changes or should be negotiated before acceptance [1]. In order to deliver a complex service with the requested functional and non-functional properties, appropriate atomic services must be chosen in the process of complex service composition. One of the most important non-functional properties of services, which is a commonly used measure of the quality of service (QoS), is the service response time. Service response time is mainly influenced by three factors: the execution time of atomic services, the load of the system (number of requests present in the system) and the communication delays on links connecting atomic services. In order to guarantee the service response time stated in the SLA, all these factors must be taken into account in the process of service composition.
1.1 SLA Translation Model
It is assumed that a new incoming i-th complex service request is characterized by a proper Service Level Agreement description denoted by SLA(i). The SLA(i) is composed of two parts describing functional and nonfunctional requirements, respectively SLAf (i) and SLAnf (i). The first part characterizes functionalities that have to be performed, while the second contains values of parameters representing various quality of service aspects (delivery times, guarantees, etc.). The SLAf (i) is a set of distinguished, separated and ordered functionality subsets:

SLAf (i) = Γi1 , Γi2 , . . . , Γij , . . . , Γini    (1)

where:
– Γ(i) = Γi1 ∪ Γi2 ∪ . . . ∪ Γij ∪ . . . ∪ Γini is the set of all functionalities necessary to complete the i-th complex service request,
– Γi1 ≺ Γi2 ≺ . . . ≺ Γij ≺ . . . ≺ Γini is the ordered sequence of distinguished functionality subsets required by the i-th complex service request; Γij ≺ Γij+1 (for j = 1, 2, . . . , ni − 1) denotes that delivery of functionalities from the subset Γij+1 cannot start before completing the functionalities from the Γij subset,
– Γij = {ϕij1 , ϕij2 , . . . , ϕijmj } (for j = 1, 2, . . . , ni) is a subset of functionalities ϕijk (k = 1, 2, . . . , mj) that may be delivered in a parallel manner (within the Γij subset); this feature of the particular functionalities is denoted by ϕijk ∥ ϕijl (ϕijk , ϕijl ∈ Γij for k, l = 1, 2, . . . , mj and k ≠ l).

The proposed scheme covers all possible cases; ni = 1 means that all required functionalities may be delivered in a parallel manner, while mj = 1 (j = 1, 2, . . . , ni) means that the required functionalities have to be delivered in sequence. It is also assumed that the ϕijk (j = 1, 2, . . . , ni and k = 1, 2, . . . , mj) functionalities are delivered by atomic services available in the computer system in several versions. It is assumed that the various versions of the distinguished atomic service offer exactly the same functionality and different values of nonfunctional parameters.
The nonfunctional requirements may be decomposed in a manner similar to that introduced for the functional requirements, i.e.:

SLAnf (i) = Hi1 , Hi2 , . . . , Hij , . . . , Hini    (2)

where Hij = {γij1 , γij2 , . . . , γijmj } is a subset of nonfunctional requirements related respectively to the Γij = {ϕij1 , ϕij2 , . . . , ϕijmj } subset of functionalities. According to the above assumption, the SLAf (i) of the i-th complex service request may be translated into ordered subsets of atomic services:

SLAf (i) = Γi1 , Γi2 , . . . , Γini ⇒ ASi1 , ASi2 , . . . , ASini ,    (3)

where {ASi1 , ASi2 , . . . , ASij , . . . , ASini } is a sequence of atomic service subsets satisfying the order (ASi1 ≺ ASi2 ≺ . . . ≺ ASij ≺ . . . ≺ ASini ) predefined by the order of the functionality subsets Γi1 ≺ Γi2 ≺ . . . ≺ Γij ≺ . . . ≺ Γini (Fig. 1).

Fig. 1. Transformation of functionalities into sets of atomic services
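As an illustration, the decomposition and translation above could be encoded as follows; the container layout, the functionality names and the example response times are assumptions made here, not the authors' notation.

```python
# Ordered functionality subsets Gamma_i1 < Gamma_i2 < Gamma_i3 of SLA_f(i).
SLA_F = [
    {"phi_1", "phi_2", "phi_3"},   # may be delivered in parallel
    {"phi_4", "phi_5"},            # must wait until the first subset completes
    {"phi_6", "phi_7", "phi_8"},
]

# AS_ij mirrors Gamma_ij; each functionality maps to the versions of its atomic service,
# here characterized by an example response time in ms (the non-functional level).
AS = [
    {"phi_1": [120, 80], "phi_2": [60], "phi_3": [200, 150]},
    {"phi_4": [90, 40], "phi_5": [70]},
    {"phi_6": [110], "phi_7": [55, 30], "phi_8": [140]},
]

# Every required functionality is covered by at least one atomic service version.
assert all(set(versions) == gamma for gamma, versions in zip(SLA_F, AS))
```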
The order in the sequence of atomic services is interpreted as the order of the functionality subsets: ASij ≺ ASi,j+1 (for j = 1, 2, . . . , ni − 1) states that atomic services from the subset ASi,j+1 cannot be started before all services from the ASij subset are completed. Each subset of atomic services ASij (j = 1, 2, . . . , ni) contains atomic services aijk (k = 1, 2, . . . , mj) available in the computer system in several versions aijkl (l = 1, 2, . . . , l(k)); ASij = {aij1 , aij2 , . . . , aijmj }. Moreover, it is assumed that any version aijkl (l = 1, 2, . . . , l(k)) of the particular atomic service aijk (aijk = {aijk1 , aijk2 , . . . , aijkl(k) }, k = 1, 2, . . . , mj) assures the same required functionality ϕijk and satisfies the quality requirements at various levels. The above assumption means that — if fun(aijkl) and nfun(aijkl) denote, respectively, the functionality and the level of nonfunctional requirements satisfaction
delivered by the l-th version of the k-th atomic service (aijkl ∈ ASij) — the following conditions are satisfied:
– fun(aijkl) = ϕijk for l = 1, 2, . . . , mk,
– nfun(aijkl) ≠ nfun(aijkr) for l, r = 1, 2, . . . , mk and l ≠ r.

The ordered functionality subsets SLAf (i) determine the possible level of parallelism of the i-th requested complex service performance (in the particular environment). The parallelism level lp(i) for the i-th requested complex service is defined by the maximal number of atomic services that may be performed in a parallel manner within the distinguished subsets of functionalities SLAf (i), i.e.,

lp(i) = max{m1 , m2 , . . . , mj , . . . , mni }.    (4)
The possible level of parallelism may or may not be exploited in the processing of the i-th requested complex service [6]. Based on the above notations and definitions, two extreme compositions exist. The first one utilizes the possible parallelism (available due to computation resources parallelism), while the second extreme composition means that the requested functionalities are delivered one-by-one (no computation and communication resources parallelism).

Fig. 2. Two extreme performance scenarios for required complex service
The above discussion may be summarized as follows (example — Fig. 2). The known functional requirements SLAf (i) may be presented as a sequence of subsets of functionalities, where the size of these subsets depends on the possible level of parallelism. The available level of parallelism
defines a set of possible performance scenarios according to which the requested complex service may be delivered. The space of possible solutions is limited, on one side, by the highest possible level of parallelism and, on the other side, by the natural scenario where all requested atomic services are performed in sequence. The above-mentioned extreme compositions determine a set of possible delivery scenarios for the i-th requested complex service. The possible scenarios can be represented by a set of graphs G(i). Nodes of a graph represent particular atomic services assuring the requested complex service functionalities, while edges represent the order according to which the atomic services' functionalities have to be delivered. All graphs in the set G(i) (G(i) = {Gi1 , Gi2 , . . . , Gis }) assure the requested functionality, but fulfill the nonfunctional requirements at various levels. The latter may be obtained (and optimized) assuming that at least one node of the particular graph contains at least two versions of the requested atomic service. The two extreme compositions listed above are denoted by the graph Gpar (i) and the set of equivalent graphs Gseq (i) (Gpar (i) ∈ G(i) and Gseq (i) ⊆ G(i)) for the parallel and sequential realization of the i-th required complex service, respectively. It is worth noting that the highest possible parallelism of delivering atomic services is represented by only one graph Gpar (i), while — in the general case — possible parallelism of delivering atomic services belonging to the distinguished ASij groups means that there is a set of equivalent compositions (the set of equivalent graphs Gseq (i)) according to which the required complex service may be delivered. The set of graphs Gseq (i) is in practice a set of equal-length paths, where the lengths of the paths (measured by the number of hops) are equal to the number of all atomic services increased by one, i.e. (m1 + m2 + . . . + mj + . . . + mni ) + 1. In the above presented example (Figure 2) the path lengths are equal to 9 and the number of all equivalent realizations of the complex service by sequences of atomic services is equal to 72.
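The figures quoted for this example (path length 9 and 72 equivalent sequential orderings) follow directly from the subset sizes; a small calculation sketch (assumed helper code, not from the paper) is given below.

```python
from math import factorial, prod

m = (3, 2, 3)                                         # subset sizes m_1, m_2, m_3

path_length = sum(m) + 1                              # hops in a sequential path -> 9
sequential_orderings = prod(factorial(x) for x in m)  # equivalent in-sequence realizations -> 72
parallelism_level = max(m)                            # l_p(i) of formula (4) -> 3
print(path_length, sequential_orderings, parallelism_level)
```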
2
Quantitative Analysis of Complex Services Realizations
In general, the optimization task may be formulated (under the assumption that all scenarios from the set G(i) deliver at least the functional requirements) as follows:
\[
SLA^{*}_{nf}(i) \leftarrow \max_{G_{i1}, G_{i2}, \ldots, G_{is}} \; \max_{a_{ijkl} \in a_{ijk} \in AS_{ij}} \big[ H_{i1}, H_{i2}, \ldots, H_{in_i} \big]. \tag{5}
\]
The latter task may be reduced when a particular composition of the i-th requested complex service (i.e., a graph equivalent to a particular processing scheme of the i-th requested complex service) is assumed. In such a case the optimization task can be formulated as follows:
\[
SLA^{*}_{nf}(i, G_i) \leftarrow \max_{a_{ijkl} \in a_{ijk} \in AS_{ij}} \big[ H_{i1}, H_{i2}, \ldots, H_{in_i} \big], \tag{6}
\]
where G_i represents the selected scenario of the i-th required complex service.
The above formulated task means that the optimal versions of atomic services, determined by the selected performance scenario of the i-th requested complex service, should be selected. Such optimization problems can be found in [2]. Any graph Gip (Gip ∈ G(i) and p = 1, 2, ..., s), i.e., the p-th possible performance scenario for the required i-th complex service, may be interpreted as a root-graph for a set of realization graphs {Gipr} for r = 1, 2, ..., R, where R is the number of all different combinations of atomic services' versions. For example, assuming that there are three distinguished groups of atomic services (ni = 3) containing, respectively, 3, 2 and 3 atomic services (m1 = 3, m2 = 2 and m3 = 3), each available in two versions, the resulting root-graph is presented in Fig. 3; in this example the root-graph of the required complex service is the source of 256 (R = 256) different realization graphs of the i-th complex service, with a possible level of parallelism equal to 3 (lp(i) = 3). This simple calculation shows how the size of a complex service (measured by the number of atomic services necessary to complete it) increases the number of possible solutions; a short check follows.
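A minimal sketch of this count, assuming the stated structure (3, 2 and 3 atomic services, two versions each):

```python
m = [3, 2, 3]              # group sizes m1, m2, m3 of the root-graph in Fig. 3
versions = 2               # number of versions of every atomic service

# A realization graph fixes one version per atomic service, so the number of
# realization graphs is the product of the version counts over all services.
R = versions ** sum(m)
print(R)                   # -> 2**8 = 256

# Parallelism level of the root-graph, equation (4).
print(max(m))              # -> 3
```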
Fig. 3. Root-graph for the required complex service realization graphs
Any two realization graphs in the set {Gipr} (determined by a particular root-graph, defined by the functionalities of a particular complex service) differ in at least one version of the atomic services aijk standing for the nodes of the graph Gip. Two possible realization graphs (Gip1 and Gip2) for the above required complex service root-graph are presented in Fig. 4.
Fig. 4. Two possible complex service realization graphs for root-graph (Figure 3)
3 Complex Service Delivery Times and Resources Utilization
The two distinguished extreme compositions of the i-th required complex service, i.e., the composition based on the highest possible parallelism of atomic services (represented by the graph Gpar(i)) and the composition based on a sequence of atomic services (represented by the set of graphs Gseq(i)), can be used to estimate the limits of the completion time of the i-th required complex service. Let d(Gpar(i)) and d(Gseq(i)) denote the completion times of the two mentioned extreme compositions of the i-th required complex service. The introduced times (delays) satisfy the inequalities given below. For the highest possible parallel composition of the i-th required complex service (under the assumption that the computation and communication resources parallelism available in the distributed system does not introduce any complex service performance limits):
\[
d(G_{par}(i)) \le \max_{k=1,\ldots,m_1} com_{in}(a_{i1k}) + \max_{k=1,\ldots,m_1} cal(a_{i1k}) + \sum_{j=1}^{n_i-1} \Big[ \max_{\substack{k=1,\ldots,m_j \\ l=1,\ldots,m_{j+1}}} com(a_{ijk}, a_{i,j+1,l}) + \max_{l=1,\ldots,m_{j+1}} cal(a_{i,j+1,l}) \Big] + \max_{l=1,\ldots,m_{n_i}} com_{out}(a_{i\,n_i\,l}), \tag{7}
\]
where:
– com_in(a_i1k) is the communication delay between the i-th complex service input interface and the k-th (k = 1, 2, ..., m1) atomic service from the A_i1 set,
– cal(a_i1k) is the calculation time of the k-th atomic service from the A_i1 set,
– com(a_ijk, a_i,j+1,l) is the communication delay between the k-th (k = 1, 2, ..., mj) and l-th (l = 1, 2, ..., mj+1) atomic services belonging to the two separate, neighbouring and ordered sets of atomic services A_ij and A_i,j+1, respectively,
– cal(a_ijk) is the calculation time of the k-th atomic service from the A_ij set,
– com_out(a_i,ni,k) is the communication delay between the k-th atomic service from the A_i,ni set and the i-th complex service output interface.
For the sequential composition of the i-th required complex service (under the assumption that resources parallelism does not exist and the complex service is composed of atomic services performed one-by-one):
\[
d(G_{seq}(i)) \le \max_{k=1,\ldots,m_1} com_{in}(a_{i1k}) + \max_{k=1,\ldots,m_1} cal(a_{i1k}) + \sum_{j=1}^{n_i} \Big[ \sum_{k=1}^{m_j} cal(a_{ijk}) + (m_j - 1) \max_{\substack{l,u \in \{1,\ldots,m_j\} \\ l \neq u}} com(a_{ijl}, a_{iju}) \Big] + \max_{k=1,\ldots,m_{n_i}} com_{out}(a_{i\,n_i\,k}), \tag{8}
\]
where com(a_ijl, a_iju) is the communication delay between successive atomic services ordered according to the given Gseq(i). It is rather obvious that the above given expressions for d(Gpar(i)) and d(Gseq(i)) represent, respectively, lower and upper bounds on the completion (delivery) time of the i-th complex service and, moreover, that the compared times always satisfy the inequality d(Gpar(i)) ≤ d(Gseq(i)). The value d(Gseq(i)) − d(Gpar(i)), which is always non-negative, may be considered as a simple measure of the attractiveness of parallelism in completing the i-th complex service. If the distance between the two considered completion times is relatively high, it means that parallel processing of the selected atomic services is worth organizing and deploying. If the distance between the two compared values is small, it means that the effort necessary to organize and perform atomic services in a parallel manner is not compensated by a noticeable reduction of the completion time of the i-th complex service. The value d(Gseq(i)) − d(Gpar(i)), i.e., the parallelism attractiveness, strongly depends on the possible parallelism in completing the i-th complex service, i.e., on the level of resources parallelism (the latter measured by the highest number of atomic services that can be computed in parallel). Assurance of the highest possible level of computing atomic services in a parallel manner is possible if, and only if, a required level of resources distribution is obtainable. It further means that the possible parallelism attractiveness should be compared with the efficiency of the required resources utilization.
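To make the two bounds concrete, the following sketch evaluates (7) and (8) for the Fig. 2 structure under assumed uniform delays (every calculation 1.0, every communication 0.1, the values later used in Sect. 4); with uniform delays all the maxima in (7) and (8) reduce to single delay values, so the code keeps only the sums. The variable names are ours, not the authors'.

```python
m = [3, 2, 3]          # group sizes m1, m2, m3
cal, com = 1.0, 0.1    # assumed uniform calculation and communication delays
n_i = len(m)

# Bound (7): highest-parallelism composition.
d_par = com + cal                                    # com_in + first group
d_par += (n_i - 1) * (com + cal)                     # inter-group com + next groups
d_par += com                                         # com_out
print(round(d_par, 2))                               # -> 3.4

# Bound (8): fully sequential composition.
d_seq = com + cal                                    # com_in + max cal of first group
d_seq += sum(mj * cal + (mj - 1) * com for mj in m)  # services executed one-by-one
d_seq += com                                         # com_out
print(round(d_seq, 2))                               # -> 9.7

# Parallelism attractiveness: the larger the gap, the more it pays off to
# organise parallel execution of the atomic services.
print(round(d_seq - d_par, 2))                       # -> 6.3
```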
The above discussed value d(Gseq(i)) − d(Gpar(i)) may be applied both as a simple measure of resources utilization in a distributed environment and for distributed environment design purposes. The aim of the former application is to evaluate resources utilization factors reflecting the suitability of the distributed environment (especially the available parallelism) for delivering the required complex services. The aim of the latter is to predict the required level of resources parallelism for a known set of complex services (given by a known set of complex services' SLAs).
4 Experimental Example
In the former analysis, lower and upper bounds for the service completion time (the service being represented as a graph) were introduced (equation (7) for the parallel and (8) for the serial case). Moreover, a parallelism measure for a graph G(i) was introduced (equation (4)), whose value ranges between l_p(G_seq(i)) = 1 for the fully sequential scenario and l_p(G_par(i)) = \sum_{k=1}^{n_i} m_k when all atomic-service vertices of the graph can be used simultaneously. Of course, other measures can be used to describe the possible graph parallelism level. Having such a parallelism measure allows one to estimate the differences in service completion times with respect to various values of the measure. This can be used to estimate the benefit (expressed here as service completion time) of using scenario graphs with various parallelism level values. The experimental example shows the lower bound of the service completion time (the shortest obtainable completion time at a given level of parallelism) for an exemplary graph with 7 vertices (two of the vertices represent the service input and output interfaces). In the example, another parallelism measure (PM1) is used for comparison, expressed as the total number of paths connecting the input and output interfaces divided by the vertex count. Measure PM2 is the measure mentioned formerly (equation (4)), expressed as the maximum number of vertices that can be used simultaneously in processing one request. The experiment consisted of generating all possible directed acyclic graphs with 7 vertices and with distinguished starting and ending vertices. Each vertex of such a generated graph was represented as a computational node with the proper atomic service version installed. Edges of the graph were modelled as communication channels linking computational nodes. To model the system and execute the experiment, the OMNeT++ simulation environment was used. For each generated scenario graph a single complex service request was executed, and the completion time of the composed complex service, together with the graph parallelism measure values, was stored. From the collected data, the lower bounds were chosen (the lowest possible completion time for a given parallelism measure value) and presented in the chart in Fig. 5. The delays were modelled as follows: the completion time of a single atomic service was set to 1, and the transport delay of each edge was set to 0.1. It is noteworthy that the parallelism measures used in the experiment have different domains, and the best and worst possible completion times (in this example d(Gpar(i)) = 1.2 and d(Gseq(i)) = 5.6) are reached for various parallelism level values.
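The two boundary values quoted above can be reproduced with a simple critical-path computation over a scenario DAG (our sketch, not the OMNeT++ model; the graph encoding and helper names are illustrative):

```python
from collections import defaultdict

def completion_time(edges, services, cal=1.0, com=0.1):
    """Longest-path completion time of a scenario DAG: every atomic service
    takes `cal`, every edge adds `com`; interface vertices take no time."""
    succ, indeg = defaultdict(list), defaultdict(int)
    nodes = {v for e in edges for v in e}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    finish = {v: 0.0 for v in nodes}
    ready = [v for v in nodes if indeg[v] == 0]
    while ready:                                   # Kahn's topological order
        u = ready.pop()
        for v in succ[u]:
            cost = com + (cal if v in services else 0.0)
            finish[v] = max(finish[v], finish[u] + cost)
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return max(finish.values())

svc = {1, 2, 3, 4, 5}                              # five atomic services
par = [("in", s) for s in svc] + [(s, "out") for s in svc]      # fully parallel
seq = [("in", 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, "out")]   # a chain
print(round(completion_time(par, svc), 2))         # -> 1.2
print(round(completion_time(seq, svc), 2))         # -> 5.6
```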
Fig. 5. Best obtainable service completing time with respect to constraints put on parallelism level value using two different parallelism measures proposed
The introduced approach can be useful in various optimization tasks concerning the shaping of the service execution graph with respect to quality of service constraints, and it can be implemented in a real SOA system as part of the composition process in order to optimize the quality of the services delivered to users.
Acknowledgments. The research presented in this paper has been partially supported by the European Union within the European Regional Development Fund program no. POIG.01.03.01-00-008/08.
References
1. Anderson, S., Grau, A., Hughes, C.: Specification and satisfaction of SLAs in service oriented architectures. In: 5th Annual DIRC Research Conference, pp. 141–150 (2005)
2. Grzech, A., Rygielski, P., Swiatek, P.: QoS-aware infrastructure resources allocation in systems based on service-oriented architecture paradigm. In: HET-NETs 2010, pp. 35–48 (2010), ISBN 978-83-926054-4-7
3. Johnson, R., Gamma, E., Helm, R., Vlissides, J.: Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, Reading (1995)
4. Keller, A., Ludwig, H.: The WSLA Framework: Specifying and Monitoring Service Level Agreements for Web Services. Journal of Network and Systems Management 11(1), 57–81 (2003)
5. Milanovic, N., Malek, M.: Current Solutions for Web Service Composition. IEEE Internet Computing 8(6), 51–59 (2004)
6. Stefanovic, D., Martonosi, M.: Limits and Graph Structure of Available Instruction-Level Parallelism (Research Note). In: Bode, A., Ludwig, T., Karl, W.C., Wismüller, R. (eds.) Euro-Par 2000. LNCS, vol. 1900, pp. 1018–1022. Springer, Heidelberg (2000)
7. Li, X., et al.: Design of an SLA-Driven QoS Management Platform for Provisioning Multimedia Personalized Services. In: AINA Workshop, pp. 1405–1409 (2008)
Ontology Engineering Aspects in the Intelligent Systems Development Adam Czarnecki and Cezary Orłowski Gdansk University of Technology, Faculty of Management and Economics, Department of Information Technology Management, ul. Narutowicza 11/12, 80-233 Gdańsk, Poland {Adam.Czarnecki,Cezary.Orlowski}@zie.pg.gda.pl
Abstract. Ontology engineering encompasses both artificial intelligence methods and the software engineering discipline. The paper addresses a selection of aspects pertaining to development activities, such as the choice of the environmental framework, functionality description, specification methods and the definition of roles. The authors refer to the ontology development projects they were involved in.
Keywords: ontology, intelligent systems, knowledge engineering, OWL, requirements specification.
1 Introduction
The ontology development process can be seen as a specific case of a software engineering project. Such steps as requirements and knowledge acquisition, formal specification, the choice of technology for implementation (languages and development tools), testing and deployment need to be planned and executed in order to succeed. It can also be regarded as a set of project management activities belonging to the following areas: processes, technologies and human resources. The authors of this paper would like to address some of these issues in the context of the projects they were involved in, mostly agent-based systems that utilize ontologies along with knowledge bases for the purpose of information technology assessment. The aim of this paper is not to present an entire engineering approach, but to pinpoint a number of dilemmas the authors have encountered while conducting their research and, to some extent, to confront them with available methods or guidelines from both the artificial intelligence and software engineering areas.
2 Ontology Definitions
The term 'ontology', which derives from philosophy, has been adopted by computer science and knowledge management, and is defined as a structure that provides the means to classify objects, to give these classifications names and labels, and to define the kinds of properties and relationships they can be assigned [13, p. 142].
A more precise definition states that an ontology is a logical theory which gives an explicit, partial account of a conceptualization [8]. Formal ontology definitions can be found in [5] and [14]. None of the proposed definitions alone may be sufficient to understand the purpose of creating ontologies. The most prominent reasons are [10, p. 1]:
─ to share common understanding of the structure of information among people or software agents,
─ to enable reuse of domain knowledge,
─ to make domain assumptions explicit,
─ to separate domain knowledge from operational knowledge,
─ to analyze domain knowledge.
The motivation to develop an ontology is closely bound up with its expected functionality (see Sect. 4, Functionality of the Ontology).
3 Ontology Development Framework
3.1 Languages
Ontology-based solutions are expected to deliver knowledge, a rather sophisticated information structure (perhaps the only higher form of data organization is wisdom). But before one starts to aggregate data into information and information into knowledge, one must begin by providing a means to express all those types of communication: the language. There are certain requirements for ontology languages concerning their expressiveness, the way of organizing concepts, creating relations, embedding axioms, and the extent to which it is possible to distinguish facts from claims or to conduct reasoning [10, pp. 2-5]. Many distinctive features can be identified and used for an exhaustive evaluation of the linguistic solutions. However, only a short list of the 'milestone' languages is mentioned in this paper: Knowledge Interchange Format (KIF), Ontolingua, Operational Conceptual Modelling Language (OCML), LOOM, and the Web Ontology Language (OWL). Ontolingua, OCML and LOOM can be described as 'classic' languages, while OWL, built upon RDF and RDFS and serialized in XML, has features (such as grounding in formal logic that allows reasoning, the open-world assumption, and monotonicity) that meet the requirements of contemporary knowledge-based systems such as the Semantic Web. The Semantic Web is a term referring to the World Wide Web Consortium (W3C) vision of the Web of linked data. The W3C recommends the Web Ontology Language (OWL) [12] as a language for ontology development. The authors' experience with past and current ontology projects also suggests accepting OWL as a standard language, with Protégé, an application developed at Stanford University, as its tool (however, in one case the use of a table-form ontology in Microsoft® Excel's Visual Basic for Applications also proved successful [2]). The language and the application tool both need a method describing how to develop an ontology in order to define an ontology technology. The next subsection of this paper addresses these methods.
3.2 Development Methods
A tool without knowledge of how to utilize it is virtually useless. One must add a method of operating the given tool. Methods can be expressed as a detailed algorithm, a formal standard or recommended guidelines (such as, for example, those given in [10]). When developing an ontology one should be aware that there is no single correct way to model knowledge. The scope and the perception of the domain in question and the conceptualization process may all vary depending, among other factors, on the planned application of the given ontology (see Sect. 4, Functionality of the Ontology). A second assumption of ontology development states that it is an iterative process. It seems unlikely that one should obtain a complete, accurate model that fully meets the requirements at the first attempt. And even if one did, the ontology is often part of a larger system which may, throughout its life-cycle, also enforce later changes. Thirdly, one should assume that the ontological model reflects the concepts from the domain of interest. Because one of the purposes of creating ontologies is to share a common understanding of the domain, it is inadvisable to model the domain in a way that would be operational for software agents but completely unreadable for any human except its originator. Bearing in mind these three assumptions, one can proceed with creating an ontology. One of the approaches divides this process into eight steps [1, p. 226]:
─ Step 1: Determine the domain and scope of the ontology.
─ Step 2: Consider reusing existing ontologies.
─ Step 3: Enumerate important terms in the ontology.
─ Step 4: Define the classes and the class hierarchy.
─ Step 5: Define the properties of classes (slots).
─ Step 6: Define the facets of the slots.
─ Step 7: Create instances.
─ Step 8: Check for anomalies.
As one can see, these steps correspond to the object-oriented approach. However, they are not identical to the programmers' point of view: they put stress on the structural properties of a class rather than on the methods and operational properties, which addresses the features of OWL well. Somewhat similar guidelines, but more related to software engineering, were presented in METHONTOLOGY by Fernández, Gómez-Pérez and Juristo [6]. They put more stress on documents such as the requirements specification and on the activities that can lead to concurrent ontology development in an evolving cycle (see Fig. 1). The third approach that can be mentioned refers to the design patterns concept [7, p. 69]. In this approach one can see ontologies as 'knowledge skeletons of domains', specific building blocks that represent some common knowledge models and which can be reused and developed incrementally. Gašević, Djurić and Devedžić [7, p. 177] write about 'the gap between the knowledge of software practitioners and AI techniques' and present a few software engineering approaches to ontology development, such as UML representation and other Model-Driven Architecture-based standards that aim to adopt a standardized architecture named the Ontology Definition Metamodel (ODM).
[Figure 1 illustrates METHONTOLOGY: the ontology life-cycle states (specification, conceptualization, formalization, integration, implementation, maintenance) together with the supporting activities (planification, acquiring knowledge, documenting, evaluating).]
Fig. 1. METHONTOLOGY: states and activities [6]
To summarize this section, one can state that while the selection of the language for ontology development seems rather obvious, as OWL is recommended for the Semantic Web, the method for conducting all the required activities is still being discussed and tends to focus either on the life cycle or on the craft of ontology modeling. And as a general and comprehensive development framework is still lacking, one can perhaps expect some specific methods or methodologies in narrower areas, for example in rule- or fact-based expert systems.
4 Functionality of the Ontology
The functionality of the ontology should be discussed in terms of the competency questions it should be able to answer. This determines the domain, scope and depth of the knowledge. There is also the matter of who or what will query the ontology; this information should be presented as a list of use cases. In this section the authors refer to the Multi-Agent System for Information Technology Assessment (MAS_IT), an example of an agent-based system that is under development [4], to present their notes on the suggested functionality of ontologies in agent-based systems that also utilize knowledge bases and an expert (sub)system. First, the authors would like to divide domain knowledge into two categories:
1. General terms and the relations among them, so-called 'meta-knowledge'. One may refer to them as classes and associations in object-oriented design (OOD).
2. Specific data sets that refer to the 'meta-knowledge' but hold 'operational knowledge'. These would be similar to objects in OOD.
Such a distinction is needed to draw a borderline between an ontology (point 1) and a knowledge base (point 2).
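A plain object-oriented sketch (our illustration, with hypothetical names, not code from MAS_IT) makes the borderline concrete: class definitions play the role of the ontology ('meta-knowledge'), while created objects correspond to the knowledge-base content ('operational knowledge').

```python
# Ontology level ("meta-knowledge"): general terms and relations among them,
# analogous to classes and associations in object-oriented design.
class Technology:
    def __init__(self, name, category):
        self.name = name
        self.category = category

class AssessmentCriterion:
    def __init__(self, name, applies_to):
        self.name = name
        self.applies_to = applies_to      # relation pointing at a Technology

# Knowledge-base level ("operational knowledge"): concrete data referring to
# the meta-knowledge, analogous to objects in object-oriented design.
jade = Technology("JADE", "agent platform")
criterion = AssessmentCriterion("vendor maturity", jade)
print(criterion.name, "->", criterion.applies_to.name)
```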
Fig. 2. Selected use cases of ontology in the exemplary agent-based expert system
As the ontology's role is clear, its functionalities should be specified. During the analysis phase the following tasks were assigned to the ontology in MAS_IT:
1. Semantics for the knowledge bases' content and for competency questions and answers.
2. Reasoning for the inferred class hierarchy search and for consistency checking.
The first set of functionalities has been transformed into use cases (Fig. 2). The reasoning, which could be a pillar of the ontology rationale in many intelligent solutions, is an auxiliary functionality in the given system and, for the time being, has not been fully devised, so it has been omitted from the diagram. Therefore the main task of the ontology is to provide a vocabulary for both use cases that include its request. A more detailed role of the ontology should be elicited in the specification phase.
5 Specifying Ontology
The process of ontology specification must precede the actual development activities. METHONTOLOGY [6] proposes, as a minimum, the following information to be provided:
─ the purpose of the ontology, including its intended users, scenarios of use, end users, etc.,
─ the level of formality of the implemented ontology,
─ the scope, which includes the set of terms to be represented.
One may ask how formal the requirements specification document must be to assure the quality of an ontology. The METHONTOLOGY mentioned above is provided with a sample ontology requirements specification [6, p. 37] that has the form of a text description with 8 labels (Domain, Date, Conceptualized by, Implemented by, Purpose, Level of Formality, Scope, Sources of Knowledge) corresponding to the list given above. This kind of document serves well as an initial source of ontology modeling
requirements. However, as the project progresses to the conceptualization phase, more formal techniques for representing the knowledge should be used. The ontology is the ultimate stage of conceptualization, but one may ask how to transform a textual specification into an OWL (or any other ontology language) form. Is there a need for some middle layer that would have formality and expressiveness sufficient to serve as a model of the ontology for the knowledge engineer (KE) and, on the other hand, would be clear enough for the domain expert to allow her or him to make changes in this document in order to provide an accurate description of the domain? Such a question arose during the MAS_AIR project [2] in the chemical domain, where the ontology was used as a source of knowledge on air pollution and atmospheric conditions and where the KE was not a domain expert. The proposed answer was UML, especially its class diagrams. The premises for using the Unified Modeling Language were as follows:
─ it is a well-defined language, popular in software engineering,
─ diagrams are, to some extent, easy to read even by a non-software specialist,
─ there are some structural similarities between UML and OWL concepts that can be mapped.
The MAS_AIR ontology and its UML class-like representation can be seen in Fig. 3. This example shows that concepts such as classes and their hierarchies are equal in both languages. The UML class diagram also provides an aggregation association as a standard, while OWL requires a specific property to be created (named 'consists_of' in Fig. 3). Generally, OWL properties ('owl:ObjectProperty') can be modeled as UML associations. Also, OWL class values can be shown as UML class attribute values. Some more similarities can be found in [7, p. 199], but there are also known incompatibilities that did not manifest themselves in the simple chemical ontology. To address a few [7, p. 204]:
─ Ontology languages are monotonic (adding new facts cannot falsify previous facts), whereas UML and other object-oriented languages are non-monotonic.
─ A property is a first-class (independent) modeling element in ontology languages, while the UML notion of an association is not.
─ In ontology languages one can specify a cardinality constraint for every domain of a property all at once, whereas in UML it must be specified for each association of the property.
To overcome these and further incompatibilities, the Object Management Group (OMG) has issued the Ontology Definition Metamodel (ODM) to make the concepts of Model-Driven Architecture (MDA) applicable to the engineering of ontologies [11]. This area, however, requires further research before the authors would be able to comment on the results of its application. In addition, it is worth mentioning an opposite approach: Jasper and Uschold [9] propose using an ontology as a specification in software development. They point out such benefits of this method as documentation, maintenance, reliability and knowledge (re)use. It seems that if the ontology representation were clear enough, any middle-layer models would be obsolete. One of the suggestions is to use the Jambalaya plug-in for the Protégé application for a graphical view of the ontology (see the upper part of Fig. 3). The idea of using the ontology as a specification document is close to
the proposal of creating ontology-based metamodels of development, manufacturing or management standards, which can be found in [3]. To conclude this section, the need for formal requirements modeling arises when the knowledge engineer is not a domain expert and the complexity of the ontology gives sound premises to pre-model it in a manner that the expert and the KE can share.
[Figure 3 depicts the MAS_AIR ontology and its UML class-like view: the classes ObservationSite, Dosimeter, ObservationDate, the average atmospheric parameters (AvgTemperature, AvgHumidity, AvgAtmPressure, AvgWindSpeed), Parameter (with name, unit, limit and alert attributes) specialized into InputParameter and OutputParameter, the pollutant classes (Benzene, Toluene, Ethylbenzene, p,m-Xylene, o-Xylene), Prediction and ObservationDataSet.]
Fig. 3. OWL and UML class diagram of the MAS_AIR ontology
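As a small OWL-side illustration of the mapping discussed above (our sketch using the rdflib Python library rather than the Protégé tooling mentioned in the text; the namespace is hypothetical and the class and property names follow Fig. 3), the UML aggregation becomes an explicit object property with a stated domain and range:

```python
from rdflib import Graph, Namespace
from rdflib.namespace import OWL, RDF, RDFS

MAS = Namespace("http://example.org/mas_air#")   # hypothetical namespace
g = Graph()

# Classes and their hierarchy (expressed the same way in UML and OWL).
g.add((MAS.Parameter, RDF.type, OWL.Class))
g.add((MAS.OutputParameter, RDF.type, OWL.Class))
g.add((MAS.OutputParameter, RDFS.subClassOf, MAS.Parameter))
g.add((MAS.Prediction, RDF.type, OWL.Class))

# The UML aggregation requires an explicit object property in OWL.
g.add((MAS.consists_of, RDF.type, OWL.ObjectProperty))
g.add((MAS.consists_of, RDFS.domain, MAS.Prediction))
g.add((MAS.consists_of, RDFS.range, MAS.OutputParameter))

print(g.serialize(format="turtle"))
```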
6 Actors of the Ontology Engineering Process The problem of the requirements specification, implementation and other ontology engineering activities is closely connected with roles that are assigned to the project participants (actors) and their responsibility. In the ontology application scenarios Jasper and Uschold [9] identify the following actors: Ontology Author, (Operational) Data Author, Application Developer, Application User, and Knowledge Worker.
Table 1. Essential roles and their responsibilities in the ontology development (to discuss)

Actor: Client/Expert
  Tasks: • Specify functional requirements. • A domain expert: a source of knowledge on notions, semantics, etc. • Requirements validation.
  Deliverables: • List of functional requirements. • Glossary of terms.

Actor: Analyst
  Tasks: • Knowledge acquisition: sessions with a client. • Representation of all requirements. • Requirements verification.
  Deliverables: • Ontology Requirements Specification (ORS). • Conceptual model document. • Formalization document.

Actor: Knowledge Engineer
  Tasks: • Ontology development on the basis of the ORS. • Knowledge acquisition: sessions with a client?
  Deliverables: • Ontology. • Implementation document.
The authors of this paper would like to introduce a simplified partition of roles in the ontology development process. Table 1 contains a list of actors, their tasks and deliverables. These roles have been derived from the requirements specification phase of software engineering. We assume, as was the case in MAS_AIR, that the client, user and domain expert roles are all held by one person, called a 'Client' in our example. Table 1 is not an ultimate proposition for defining roles in the ontology development process, but is put here to raise some questions. For example, is an Analyst role (as a separate person) required? Could it not be merged with the Knowledge Engineer? As discussed in the previous section of this paper (see Sect. 5, Specifying Ontology), formal descriptions of the domain can be used in the knowledge acquisition and conceptualization activities. But if the representation of the ontology developed by the KE is clear to the Client, do they need an Analyst as a 'middleman'? For the sake of documenting the requirements such an actor may be useful, but if there are no problems with reflecting the requirements in the ontology and tracing the changes, such a role and work would be redundant (in a negative sense). If we merge these two roles, the Analyst and the Knowledge Engineer, we can erase the question mark behind the task of knowledge acquisition assigned to the KE. Of course, then we have to take into account the social skills of the KE. He or she plays a role somewhat similar to that of the software developer in a 'classic' software engineering project. Such a person cannot be focused only on her or his development environment, screen and keyboard, but has to actively cooperate with the domain expert(s) in order to acquire all the information required in the ontology. To sum up the motivation behind the idea of tight Expert-Knowledge Engineer collaboration: it should shorten the development time and improve the quality of the ontology (by reducing the 'Whisper Down the Lane' effect).
7 Summary
In this paper the authors have addressed a few issues that emerge during ontology development activities: choosing the framework, devising the functionality of the ontology as a module of a knowledge-based system, ways to specify the ontology, and roles and their responsibilities in the development process. Ontology development is a process where AI and software engineering methods meet. Skills in formalizing knowledge must be accompanied by project life-cycle awareness. On the basis of the authors' own experience and a literature study, the following guidelines can be stated:
─ OWL, as a standard, especially in the Semantic Web area, should be considered as the language for ontology development, with its DL sublanguage providing a good balance between expressivity and performance.
─ The functionality of the ontology may vary, but the most prominent use cases are reasoning and providing a vocabulary.
─ The process of ontology specification should be documented. The level of formality of such documents and the number of 'layers' between the initial requirements specification and the complete ontology may differ; however, the fewer the better.
─ The same may apply to the number of 'middlemen' in the ontology specification, knowledge acquisition and conceptualization process.
─ The ontology is just a part of the 'Semantic Web layer cake', and the maturity of its engineering methods relies upon the maturity of the other technologies engaged in the development of Semantic Web applications.
This paper is a result of the research on applying intelligent methods to IT evaluation funded by the Ministry of Science and Higher Education.
References
1. Antoniou, G., van Harmelen, F.: A Semantic Web Primer, 2nd edn. The MIT Press, Cambridge (2008)
2. Czarnecki, A., Orłowski, C.: Hybrid Approach to Ontology Specification and Development. In: Grzech, A., Borzemski, L., Świątek, J., Wilimowska, Z. (eds.) Information Systems Architecture and Technology: Service Oriented Distributed Systems: Concepts and Infrastructure, pp. 47–57. Wrocław University of Technology, Wrocław (2009)
3. Czarnecki, A., Orłowski, C.: Ontology As a Tool for the IT Management Standards Support. In: Jędrzejowicz, P., Nguyen, N.T. (eds.) Proceedings of KES-AMSTA 2010 (to be published, 2010)
4. Czarnecki, A., Orłowski, C.: Ontology Management in the Multi-Agent System for the IT Evaluation. In: Grzech, A., Borzemski, L., Świątek, J., Wilimowska, Z. (eds.) Information Systems Architecture and Technology: Information Systems and Computer Communication Networks, pp. 37–48. Wrocław University of Technology, Wrocław (2008)
5. Ehrig, M.: Ontology Alignment: Bridging the Semantic Gap. Springer, New York (2007)
6. Fernández, M., Gómez-Pérez, A., Juristo, N.: METHONTOLOGY: From Ontological Art towards Ontological Engineering. In: Proceedings of the AAAI 1997 Spring Symposium, pp. 33–40. AAAI Press, Menlo Park (1997)
7. Gašević, D., Djurić, D., Devedžić, V.: Model Driven Engineering and Ontology Development, 2nd edn. Springer, Heidelberg (2009)
8. Guarino, N., Giaretta, P.: Ontologies and Knowledge Bases: Towards a Terminological Clarification. In: Mars, N. (ed.) Towards Very Large Knowledge Bases: Knowledge Building and Knowledge Sharing, pp. 25–32. IOS Press, Amsterdam (1995)
9. Jasper, R., Uschold, M.: A Framework for Understanding and Classifying Ontology Applications. In: Benjamins, V.R. (ed.) Proceedings of the IJCAI 1999 Workshop on Ontologies and Problem-Solving Methods: Lessons Learned and Future Trends, vol. 18. CEUR Publications/University of Amsterdam, The Netherlands (1999)
10. Noy, N.F., McGuinness, D.L.: Ontology Development 101: A Guide to Creating Your First Ontology. Stanford Knowledge Systems Lab. Technical Report KSL-01-05 and Stanford Medical Informatics Technical Report SMI-2001-0880, Stanford, USA (2001)
11. Ontology Definition Metamodel, Version 1.0, http://www.omg.org/spec/ODM/1.0/PDF
12. OWL 2 Web Ontology Language Document Overview. W3C Recommendation (October 27, 2009), http://www.w3.org/TR/owl2-overview/
13. Passin, T.B.: Explorer's Guide to the Semantic Web. Manning Publications, Greenwich (2004)
14. Smirnov, A., Levashova, T., Shilov, N.: Knowledge sharing in flexible production networks: a context-based approach. International Journal of Automotive Technology and Management 9(1), 87–109 (2001)
Supporting Software Project Management Processes Using the Agent System Cezary Orłowski and Artur Ziółkowski Gdansk University of Technology, Faculty of Management and Economics, Department of Information Technology Management, ul. Narutowicza 11/12, 80-233 Gdańsk, Poland {Cezary.Orlowski,Artur.Ziolkowski}@zie.pg.gda.pl
Abstract. The natural development of the information technology area stimulates the intense growth of various technologies which support basic organization processes. The range of technologies implies the need for their appropriate management, as well as their accurate correspondence with the conducted activities. The shift away from activities within individual branches towards a project approach to performed assignments (especially those connected with software development) has recently become a popular subject of management. This approach involves a variety of technologies which support project tasks. Yet the question arises of how to harmonize the proper technologies with project management in an effective way. The agent system built by the authors is a solution to this problem. The system is a tool which helps project managers choose a proper method and technology of project management. One of the most important decisions taken before a project starts is the selection of methods and tools. When the method is not fitted to the project team, it can lead to the project's failure. The client's knowledge of the IT project area can also be important for achieving success.
Keywords: IT project, agent system, decision supporting.
1 Reasons to Discuss the Subject
Managing IT projects does not only entail classic management functions; it is more often perceived as a kind of art that above all requires the broad experience of managers and their ability to optimise the technologies applied during the realisation of such projects. It is said that almost 70% of IT projects are unsuccessful due to exceeding one of the basic limitations (the schedule, budget or scope of the project). The fact that software project management is a kind of art is also confirmed by the general opinion on the unrepeatability of projects. It turns out that, contrary to, e.g., construction projects, in IT projects the range of changeability and unrepeatability is so wide that it is difficult to conduct two identical IT projects [1]. Such an observation introduces a set of management problems. It appears that in conditions of low repeatability and high changeability, basic (classic) management functions are insufficient to achieve market success. These problems may be described as specific management problems. They may relate to problems with
team-adjustment as well as with resource planning, setting a deadline and many more. In addition, such unrepeatability of IT projects entails the necessity to seek system solutions which facilitate the work of managers in IT projects. Project realisation models that reliably ensure the success of the projects they realise have therefore been sought for many years. Thus emerged the so-called classic models of IT project realisation (also known as 'hard') and their further developments and modifications (e.g. the Rational Unified Process). Over the years, classic models proved to be of low efficiency and could not adjust to the high dynamics of changes in certain situations (especially where high changeability of requirements appeared). These problems gave rise to new, so-called 'soft' project management methods (e.g. SCRUM). However, it is very often the case that regardless of the choice of the project realisation method, soft or classic, the percentage of unsuccessful projects is very similar. The question arises about the reasons for unsuccessful IT projects as well as about the possibility of counteracting them. Project managers agree on one issue: managing IT projects requires system support, i.e. adopting certain structural solutions in projects. This means that a methodological approach prescribing the application of recommendations (good practices) constituting separate IT project realisation methods is justified. However, the above-mentioned unrepeatability of projects means that the choice of a project management method alone does not guarantee success. Success may only be guaranteed by the proper choice of a method, meaning adjusting it to the basic limitations that a project manager has to face. It seems appropriate to indicate the key parameters which influence the decision of the project manager regarding the choice of a project management method. This also means that the main axis of the research conducted by the authors of this paper is the identification of the parameters influencing the choice of a project management method and the mechanism of processing these parameters in order to effectively indicate the proper project management method in given circumstances. Such an approach should serve as support for the project manager who, by describing the relevant parameters, chooses the method which suits the given project best and guarantees its success, counteracting the possibility of exceeding the limitations (the schedule, budget or scope). However, the research problem presented here entails another relevant issue: is the automation of such decision-making possible; in other words, can the choice of the project realisation method be electronically supported by an independent decision system? The answer to such a question seems evident, provided that the input parameters can be transformed into a formal language. Therefore, a part of the research conducted by the authors focuses on such a record of the relevant factors influencing the choice of a method that the IT system is able to indicate which method (or its elements) should be applied in the given project. A large part of the research focuses also on the IT system itself. This paper presents the concept of an agent system, supported by an ontology and an expert system, which, using the given input parameters, will be able to indicate in an objective way which method should be applied in the given project [2].
Applying elements of artificial intelligence seems particularly justified in such fields as software project management. This mainly arises from the above-mentioned fact that projects are characterised by high changeability. Changeability may also after some time result in the fact that the set of criteria deciding on the success of a project or the set of criteria influencing the choice of a method
undergo change. Applying elements of artificial intelligence (including programme agents) enables a system to adjust in a flexible way to future changes, ensures calibration if the number of parameters increases, and enables the inclusion of such a system in another device (e.g. one supporting the realisation of IT projects). Summarising the above assumptions, it must be stressed that it is important to demonstrate the necessity of supporting the decisions of IT managers with intelligent IT systems which assist project management in choosing specific methods and devices for project realisation.
2 The Issue of Supporting the Software Project Management Processes
The problem of choosing methods to manage IT projects is becoming an ever more common management dilemma. It appears that the improper choice of methods very often causes the failure of the project (by not fulfilling its conditions – the budget, scope or schedule). It must be deemed that support in decision-making regarding the methods and devices applied in the realisation of the project is a basic planning task of the project manager. It is thus particularly worth focusing on the planning function, because it is exactly within this function that decisions regarding the further realisation of the project are made. In other words, this managing function is fundamental as far as the further performance of the project is concerned. Errors made at the planning stage (regarding both the team and the schedule) may have many repercussions in the course of the project itself. It is therefore at the planning stage that key decisions regarding project management methods are made. It is also at the planning stage that the manager should indicate the method which he or she will apply in the realisation of the given project and choose the device that will support its realisation. The choice of the proper method for the realisation of the project may therefore influence its success. Choosing a method that is scarcely known to the manager or barely recognised by the team may result in delaying work on the project and, subsequently, in an eventual failure due to, for example, exceeding the schedule. Choosing the proper method is therefore fundamental for the whole project. It seems justified to support managers at the stage of making planning decisions regarding the given IT project with an independent system giving advice on which method would be optimal for the nature of the project. This also means that the formulated hypothesis, that managers should be systemically supported in choosing methods for the realisation of projects, seems to be confirmed by the above-mentioned reflections. It is also worth adding that there are numerous formal methods of project management (about 10 main methods, e.g. RUP, PRINCE, SCRUM, XP). Apart from these, we can distinguish many methods individually created by software producers for their own needs (as in SAP). A system solution which, based on input conditions, indicates the method of realisation of the project as well as the proper IT device for the project seems very useful and is related to the main aim of the conducted research, namely the optimization of the managing processes conducted by project managers [3, 5].
In order to describe the input parameters that may be relevant for the choice of the project management method, the most important elements existing at the very beginning of planning an IT project should be analysed. Such elements and their description will allow for the extraction of the basic parameters influencing management decisions regarding the choice of methods and devices.
3 Presentation of the Model to Support the Software Project Management In order to support management decisions, the concept of an agent system indicating a project management method based on certain circumstances indicated by a project manager was developed [4]. The system approach to this concept is described below. According to earlier assumptions, the basis of reasoning is to collect key information regarding input parameters.
Fig. 1. Key areas of the system
Experiences gained during the realisation of many projects enable the indication of three basic key variables for decision-making regarding the choice of an IT project management method. These variables include:
a) Client maturity
b) Organisation maturity
c) Project entropy
Client maturity means the level of client preparation to participate in a project. Increasingly, IT project management methods deal with the client's participation and their influence on the changeability of the project. The necessity to modify the itineraries of the conducted projects is usually caused by client interference [7]. Depending on the level of familiarity with a project management field, we can distinguish 4 instances of client maturity:
ADJUSTED AND PROPER – a situation when the client is familiar with the project management field, understands its processes and is open for cooperation. Project managers seek such client representatives.
ADJUSTED AND IMPROPER – a situation when the client is open for cooperation but is not familiar with the project field. It is often the case that the client matures as the project develops.
UNADJUSTED AND PROPER – a situation when the client is familiar with the project field but is difficult in cooperation. This situation often leads to endangering project finalisation.
UNADJUSTED AND IMPROPER – a situation when the client is not familiar with the project management field, does not understand the conditions of the project and is barely engaged in it. This is the most difficult scenario for a manager.
Organisation maturity relates to the immanent features of the team conducting the given project. It is said that modern IT organisations gain certain levels of maturity (e.g. the CMMI model describes this). Depending on the maturity level, a proper project management method should be adjusted. Soft methods requiring the self-organisation of work are much more recommended for more experienced teams (at a higher maturity level) than for young, inexperienced teams. An exemplary gradation of this parameter may be described as follows (as in CMMI):
• LEVEL 1 – FUNCTIONING (processes engage the resources necessary to create the final result).
• LEVEL 2 – MANAGING (processes of such an organisation are planned and executed according to a given policy, the realisation of the project is evaluated according to its coherency with the plan).
• LEVEL 3 – DEFINED (an organisation is able to obtain information on indicators relating to the final result).
• LEVEL 4 – QUANTITATIVELY MEASURED (processes are controlled using statistical and quantitative techniques, the quality of products and services and process performance are measurable and controlled for all projects).
• LEVEL 5 – OPTIMISING (the organisation focuses on continuous improvement of process performance, all processes are controlled).
Project entropy is the measurement of the disorganisation of information and therefore measures the uncertainty of the realisation of developing and managing processes in the project. It is often the case that before commencing the realisation of the project, a set of working, organisational sessions are conducted, directing work and indicating the scope of activities. Unfortunately, it is also common that at the beginning the final shape of the planned solution is not known. Regarding this uncertainty, connected with the lack of knowledge and lack of precise expectations or the inability to obtain the full scope of expectations from the client, projects should be divided into:
• LOW ENTROPY PROJECTS – small projects with their aims and basic assumptions stated very clearly.
• MEDIUM ENTROPY PROJECTS – projects where the basic assumptions need to be verified a few times.
• HIGH ENTROPY PROJECTS – projects where uncertainty regarding the level of changeability dominates; expectations towards the project are not clearly stated.
In the context of the above-mentioned aggregated variables (client maturity, organisation maturity and project entropy), it seems justified to parameterise them in detail at the subsequent stage, which enables their full processing by the agent system. It is
also justified to equip agents with knowledge that will enable them to define the system input parameters exactly. The first steps of the quantification of input factors have been taken; an example of such parameterisation is a matrix-vector approach. Four stages of data processing were distinguished in the functioning of the agent system [6]. The stage of the initial processing of input variables includes procedures of aggregation of an increase in the organisation maturity level from variables based on increases in the maturity levels of key processes; these processes were described with the variables (∆ok1) and (∆ok2). Similarly, the increases (∆pt) and (∆kt) are defined as aggregations of, respectively, the variables (∆pnt) and (∆pdt), corresponding to increases in IT application and field knowledge levels, and (∆ppt) and (∆pot), corresponding to increases in the client adjustment and appropriateness levels. The previously described variables (∆pt), (∆kt), (∆ot) are initially processed in a static system. The characteristics of their processing are presented by equations (1)-(3):
\[
P_\varphi : \Delta o_t \to \Delta z_t \quad \text{and} \quad P_\varphi : \Delta z_t = \Pi \cdot \Delta o_t \tag{1}
\]
\[
P_\varphi : \Delta p_t \to \Delta z_t \quad \text{and} \quad P_\varphi : \Delta z_t = \Pi \cdot \Delta p_t \tag{2}
\]
\[
P_\varphi : \Delta k_t \to \Delta z_t \quad \text{and} \quad P_\varphi : \Delta z_t = \Pi \cdot \Delta k_t \tag{3}
\]
where:
– ∆ot is the vector of the initial increase in organisation maturity level,
– ∆pt is the vector of the initial increase in project entropy,
– ∆kt is the vector of the initial increase in client maturity level,
– ∆zt is the vector of the increase in management level,
– Π is the methodological transformation matrix of the influence of the initial increases in the maturity levels of the supplier and client organisations and in project entropy on the increase in the management level.
Therefore, the function Pφ, realising the transformation of the initial increases in the maturity levels of the supplier and client organisations as well as in project entropy into the initial increase in the project management level, may be illustrated as follows:
\[
P_\varphi : \Delta z_t = \Pi \cdot ( \Delta o_t + \Delta p_t + \Delta k_t ) \tag{4}
\]
Since in the model it was assumed that the vector of the initial increase in the supplier organisation maturity level is described with the elements ∆ok1t and ∆ok2t, its form may be formulated as follows:
\[
\Delta o_t = \begin{bmatrix} \Delta ok1_t \\ \Delta ok2_t \end{bmatrix} \tag{5}
\]
where:
– ∆ok1t means an increase in processes in key areas 1, ∆ok1t ∈ J[5,100],
– ∆ok2t means an increase in processes in key areas 2, ∆ok2t ∈ J[5,100].
The vector of the initial increase in the client maturity level may be similarly described with the elements ∆ppt and ∆pot, representing, respectively, increases in the adjustment level and in the appropriateness level:
\[
\Delta k_t = \begin{bmatrix} \Delta pp_t \\ \Delta po_t \end{bmatrix} \tag{6}
\]
where:
– ∆ppt means an increase in the client adjustment level, ∆ppt ∈ J[5,100],
– ∆pot means an increase in the client appropriateness level, ∆pot ∈ J[5,100].
Also, the vector of the initial increases in project entropy may be described with the elements ∆pnt and ∆pdt, representing, respectively, an increase in the field familiarity level and an increase in the device familiarity level:
\[
\Delta p_t = \begin{bmatrix} \Delta pn_t \\ \Delta pd_t \end{bmatrix} \tag{7}
\]
where:
– ∆pnt means an increase in the field familiarity level, ∆pnt ∈ J[5,100],
– ∆pdt means an increase in the device familiarity level, ∆pdt ∈ J[5,100].
Increases in the maturity levels of the supplier (∆ot) and client (∆kt) organisations and a decrease in project entropy (∆pt) influence the increase in the project management level ∆zt, which may be described in a vector form:
\[
\Delta z_t = \begin{bmatrix} \Delta k_t \\ \Delta o_t \\ \Delta p_t \end{bmatrix} \tag{8}
\]
In order to calculate these quantities, the factors of the Π matrix [7] need to be defined. This involves applying management factors αi, βi, χi (i = 1, 2) enabling the processing of the initial increases in (organisation and client) maturity and in project entropy:
\[
\Pi = \begin{bmatrix} \alpha_1 & \alpha_2 \\ \beta_1 & \beta_2 \\ \chi_1 & \chi_2 \end{bmatrix} \tag{9}
\]
where:
– αi are the factors of matrix transformation of the supplier organisation maturity level, αi ∈ R[0,1],
– βi are the factors of matrix transformation of the client maturity level, βi ∈ R[0,1],
– χi are the factors of matrix transformation of the project entropy level, χi ∈ R[0,1],
– R[0,1] is the set of real numbers from the closed range [0,1].
These factors illustrate the influence of the maturity levels of the organisation and the client and of the project entropy.
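A minimal numeric sketch of this initial-processing stage, following equations (4), (5)-(8) and (9); all numeric values below are invented for illustration only and are not taken from the paper:

```python
import numpy as np

delta_o = np.array([20.0, 10.0])   # [Δok1_t, Δok2_t]: supplier organisation maturity
delta_k = np.array([15.0, 25.0])   # [Δpp_t,  Δpo_t ]: client maturity
delta_p = np.array([30.0,  5.0])   # [Δpn_t,  Δpd_t ]: project entropy

# Methodological transformation matrix Π with management factors in [0, 1].
Pi = np.array([[0.6, 0.4],         # α1, α2
               [0.5, 0.5],         # β1, β2
               [0.3, 0.7]])        # χ1, χ2

# Equation (4): Δz_t = Π * (Δo_t + Δp_t + Δk_t)
delta_z = Pi @ (delta_o + delta_p + delta_k)
print(delta_z)                     # resulting increase in the management level
```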
4 Implementation of the Model For the needs of implementing the management decisions of the support system, hierarchy dependencies between the agents of the system were adopted: 3 main categories of agents were defined:
• Intermediary agent(s) – responsible for communication with the user,
• Searching agent(s) – responsible for searching the system resources (including communication with the expert system),
• Managing agent – responsible for supervising other agents.
It is also possible to use some other agents, such as an ontology agent (to communicate with the ontology, which is one of the system components) or evaluating agents. The scope of their activity is equivalent to separate functionalities of the system (such as obtaining information, processing, sending results, etc.). The system architecture is presented below:
Fig. 2. Architecture of an IT assessment agent system
It is worth examining the mechanism of the activity of the constructed system. The main function, reasoning, is conducted based on an embedded expert system that the relevant agents communicate with. Depending on the values of the input parameters (relating to project complexity, organisation maturity and client maturity), the evaluation process is activated by the agent system. As a result of the evaluation, the person inquiring (the user) receives an indication of which IT project realisation method should be applied for the needs of creating a corporate architecture. At the first stage of system activity, the intermediary agent obtains the information entered by the user in the inquiry field. This information is transferred to the managing agent. Based on this information, the managing agent makes a decision about using the resources possessed by the system. Such a flow of activities results from the adopted organisation of the system, whereby agents are grouped according to their functions and hierarchy. Based on the knowledge base, the managing agent analyses the received inquiry. In the event of a search, the managing agent activates the searching agent and forwards the order to search the knowledge resources available in the system (the fact base). In the event of
an inquiry requiring calculations, the managing agent activates the particular calculation module (the expert system or neural network) and orders the searching agent to retrieve the results. After that, the results are forwarded to the managing agent. Results obtained through calculations are submitted to the client via the intermediary agent.
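The message flow just described can be sketched in JADE (the environment later adopted for the full agent implementation). The agent name and the content conventions used below are illustrative assumptions, not the actual prototype code.

```java
import jade.core.AID;
import jade.core.Agent;
import jade.core.behaviours.CyclicBehaviour;
import jade.lang.acl.ACLMessage;

/**
 * Managing agent: receives user inquiries forwarded by the intermediary agent
 * and delegates them to the searching agent. The "CALC:"/"SEARCH:" content
 * convention and the agent names are hypothetical.
 */
public class ManagingAgent extends Agent {

    @Override
    protected void setup() {
        addBehaviour(new CyclicBehaviour(this) {
            @Override
            public void action() {
                ACLMessage msg = myAgent.receive();
                if (msg == null) {
                    block();           // wait until a new message arrives
                    return;
                }
                // Decide how to use the system resources for this inquiry.
                ACLMessage order = new ACLMessage(ACLMessage.REQUEST);
                order.addReceiver(new AID("searching-agent", AID.ISLOCALNAME));
                if (msg.getContent().startsWith("CALC:")) {
                    // Inquiry requiring calculations: the searching agent is
                    // ordered to collect results from the calculation module.
                    order.setContent("FETCH-RESULTS:" + msg.getContent());
                } else {
                    // Plain search over the knowledge resources (the fact base).
                    order.setContent("SEARCH:" + msg.getContent());
                }
                myAgent.send(order);
            }
        });
    }
}
```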
5 The Conducted Verification Experiment
The subsequent step for the created model was to check the correctness of the adopted solution. It was decided that the initial system verification should be conducted based on data from a technical system. Because the input data in the evaluation of technology for software project management is mainly qualitative (e.g. project complexity, organisation maturity, etc.), the verification required a dedicated evaluation model. The development of such a model required a variety of actions leading to the conversion of qualitative data into quantitative data. Such quantification is essential in order to ensure the proper data flow in the above-mentioned reasoning process. Due to the lack of a ready evaluation model, the data of a technical system regarding atmospheric parameters (such as temperature and humidity) were adopted in the initial verification, where their influence on the level of air pollution in the Tri-city agglomeration was described (measured by the concentration of substances such as toluene, benzene, etc.).
After the development of the complete version of the agent system model, the decision was made to commence the process of its verification. In practice, this meant the need for an implementation model built with tools for rapid application development (RAD). The basic environment for systems implementing agent structures is JADE [B], but for the needs of testing the whole system, the VBA environment (Microsoft Visual Basic for Applications), available in the MS Excel application, was selected. According to the RAD approach, it was decided that the first implementation should be conducted with tools and environments that ensure fast system implementation and are well known to the authors of the system concept.
The aforementioned data were applied in testing the mechanisms of the performance of the agent system, particularly the behaviour of each subject (the agents, ontologies and knowledge bases) separately, as well as examining their behaviour as a whole. The advantage of using data from a technical system related to pollution is undoubtedly the fact that the quality of repeatable results may be quickly evaluated, i.e. it can be checked whether the achieved results are within the acceptable range – whether the generated results are correct. Only such an approach gives the additional possibility of so-called "system tuning", meaning the verification of all the system's subjects and the evaluation of the accuracy of the achieved results. The developed procedures of resource searching and forecast generation based on the gathered data became the basis of the verification of the performance of the developed model. The data obtained from the technical system proved to be appropriate for testing the information flow between the subjects of the multi-agent system. Conclusions from the conducted tests of the performance of the system may be presented as follows: the system agents act properly, react to stimuli according to expectations and the sequential flow of messages is realised according to assumptions; each subsequent message implies activity specific to the given type of message agent; the agents
commence an action only when there is a signal from a client – expert (an inquiry); all the intended functions of the agents were exercised in the testing procedure.
5.1 Summary
The tasks of the agent system and the data input to the system, illustrated in this way, indicate the next step of the presented concept. In further stages of work, it seems justified to parameterise in detail the input variables (client maturity, organisation maturity and project entropy) by identifying measurable values, which will enable their full processing by the agent system. It is also justified to equip the intermediary agent with knowledge that will enable it to precisely define the input parameters of the system. Testing the implemented solution seems to be an important stage in the verification of the presented concept as well. Initial development work on the creation of the complete agent system has already commenced. Subsequent experiments will show the behaviour of the agents in situations where data come from a sociotechnical or social system, and how the adopted architecture allows managers to apply the created system in IT project planning.
References [1] Altmann, J., Gruber, F., Klug, L., Stockner, W., Weippl, E.: Using Mobile Agents in the real World: A Survey and Evaluation of Agent Platforms. In: Proceedings of the Second International Workshop on “Infrastructure for MAS and Scalable MAS”, Montreal, Canada, May 28-June 1 (2001) [2] Angryk, R., Galant, V., Paprzycki, I.M.: Travel Support System - an Agent-Based Framework. In: Proceedings of the International Conference on Internet Computing. CSREA Press, Las Vegas (2002) [3] Galant, V., Tubyrcy, I.J.: Inteligentny Agent Programowy, The Intelligent Programme Agent, Prace Naukowe AE, Wrocław (2001) [4] Jennings, N.R.: An agent-based approach for building complex software systems. Communications of the ACM (2001) [5] Hendler, J.: Is There an Intelligent Agent in Your Future? Nature 11 (March 1999) [6] Ziółkowski, A., Orłowski, C.: Definicja zadań inteligentnych agentów do oceny technologii informatycznych, The definition of the tasks of intelligent agents in the evaluation of IT, Komputerowo Zintegrowane Zarządzanie, t.2, Oficyna Wydawnicza Polskiego Towarzystwa Zarządzania Produkcja, Opole 2008 (2008) [7] Wachowiak, P. (red.): Pomiar kapitału intelektualnego przedsiębiorstwa, Measuring the intellectual capital of a company, Szkoła Główna Handlowa w Warszawie, Warszawa (2005)
Knowledge-Based Virtual Organizations for the E-Decisional Community
Leonardo Mancilla-Amaya1, Cesar Sanín1, and Edward Szczerbicki2
1 School of Engineering, Faculty of Engineering and Built Environment, The University of Newcastle, Australia, University Drive, Callaghan, 2308, NSW
[email protected], [email protected]
2 Gdansk University of Technology, Faculty of Management and Economics
[email protected]
Abstract. Virtual Organizations (VOs) promote dynamic interaction between individuals, groups and organizations, who share their capabilities and resources to pursue a common goal and maximize their benefits. Among these resources, knowledge is a critical one that requires special attention in order to support problem-solving activities and decision-making processes and to provide strategic advantage. In this paper, we present an initial proposal for the creation of dynamic knowledge-based VOs inside the E-Decisional Community, an integrated knowledge sharing platform that aims at the creation of markets where knowledge is provided as a service. This approach will provide technological support for discovering, re-using, evolving and sharing experiential knowledge represented by means of the Set of Experience Knowledge Structure (SOEKS) and Decisional DNA among several entities.
Keywords: E-Decisional Community, Virtual Organizations, SOEKS, Decisional DNA, Software Agents.
1 Introduction
The E-Decisional Community [10] is a Community of Practice (CoP) that allows sharing experiential knowledge across different organizational levels. It uses a standard knowledge representation called the Set of Experience Knowledge Structure (SOEKS), which in turn comprises Decisional DNA [17]. This platform is based upon the principles of different computing technologies, namely software agents, Grid, and Cloud computing. The main goal is to create a knowledge-based market, where experience and knowledge are provided as services to support complex decision-making processes. One of the features defined for the E-Decisional Community is dynamic group formation. These groups will support the execution of knowledge-intensive tasks when an individual's knowledge is not enough to provide a suitable solution. Given that the E-Decisional Community aims to incorporate Grid computing concepts into the knowledge management arena, dynamic groups are based on the concept of Virtual Organizations (VOs) in the sense of Foster, Kesselman, and Tuecke [4].
The key difference is that the resources shared in our proposal do not refer to computers, software or data; resources in the context of our proposal are knowledge and experience represented by Decisional DNA. As a consequence, we call these dynamic groups Knowledge-Based Virtual Organizations (KBVOs). These KBVOs are not limited to inter-organizational relations. We generalize the concept to include every single group of interest and organizational unit in an enterprise. Each one of them has different knowledge resources to share, and even inside the same organization, access policies and rules can vary from group to group. Because of this, we can represent each gathering of workers as a separate KBVO, and the entire system as a composition of organizations. Taking into account that the E-Decisional Community uses software agents to represent workers in the system, we will eventually have a large Multi-Agent System (MAS). In addition, because the E-Decisional Community is a people-oriented platform, the study of social and behavioral factors surrounding VOs is of great importance. Since the concept of KBVOs encompasses every team, group of interest and organization in the system, a comprehensive literature review has been performed to identify key elements that can later be mapped into the E-Decisional Community to better reflect human behavior and make the KBVO model as complete as possible.
The remainder of this paper is structured as follows: section two presents a theoretical background on social theories of virtual teams and organizations. Section three first presents the initial proposal for KBVOs, describing the initial set of requirements and their conceptual relationships; secondly, the initial model to measure trust and reputation is presented, along with a validation prototype and experimental results. Finally, section four presents a discussion and future work plans for the E-Decisional Community.
2 Background
Virtual interactions have become increasingly popular as a new way to perform work-related activities. However, the success of such environments in any organization depends not only upon technological factors, but also on social and behavioral elements related directly to the workforce and the firms themselves. Researchers have analyzed virtual environments from different perspectives in order to understand them better and make them more efficient. For instance, Spaulding [18] used a value chain approach, social contracts and trust theory to determine how an enterprise can obtain value from its interaction with different virtual communities. M.-J. J. Lin, Hung, & Chen [9] identified that elements like reciprocity, trust, perceived advantage and compatibility, among others, can influence knowledge sharing behavior. Additionally, Panteli & Sockalingam [12] proposed a framework in which trust and conflict are considered as important elements in knowledge-oriented processes. Knowledge management strategies that allow virtual organizations and communities to acquire competitive advantage have also been explored. Elements like policies, performance measures, fast adaptation and knowledge as a basis for process integration are some of the components that researchers such as Burn & Ash [2] have proposed to assure optimal decision making.
Finally, the contract metaphor has been extended from the legal universe into the virtual enterprise world. For instance, Hoffner et al. [6] present a framework that explores the management of business relationships in the virtual world in an automated and efficient way. This is achieved by defining a novel contract framework and a virtual market technology, among other features, that assist enterprises in the process of searching for partners and negotiating the characteristics of contracts.
3 Knowledge-Based Virtual Organizations in the E-Decisional Community
3.1 Initial Requirements
As a people-oriented platform, the E-Decisional Community must use indicators, measurements and mechanisms that reflect human behavior in a computer-based environment. This will also give support for software agents to engage in knowledge sharing activities like real people would do. Table 1 summarizes the features that we have selected as relevant at this stage of our research. We have selected features that are common in the literature and therefore widely used and accepted. Also, the selected items are sufficient at this stage of development to create a prototype to validate knowledge sharing based on dynamic groups for the E-Decisional Community.

Table 1. Elements to be used in the E-Decisional Community for dynamic community formation
Trust – [3], [7], [9], [12], [15], [21], [22]
Reputation (personal, group) – [3], [15], [22]
Performance – [8], [13], [15]
Knowledge Quality and Quantity – [7], [11], [15]
Community Participation – [12], [22]
Quality of KS – [6]
Satisfaction – [8], [22]
Community size and age – [13]
Rewards/Punishment – [8], [15]
Coordination & Communication – [3]
Community Promotion – [7]
Conflict – [11], [15]
Contracts – [8], [23]
A more detailed analysis of table 1 shows that trust is considered by many as a key aspect, and its relevance leads us to argue that it can act as the driving factor surrounding community formation for decision making. Trust does not refer only to the degree of confidence between entities, but also to the precision and usefulness of the knowledge they share. In spite of being intrinsic characteristics of MAS, communication and coordination are explicitly identified to emphasize their importance in the community formation process, and their relationship with the other components, as a way to reflect human behavior. The remaining elements in figure 1 are discussed in the following section. Finally, as part of the requirements for VO formation in the E-Decisional Community, a life cycle for such groups must be defined. We will use a commonly accepted
life cycle model for virtual organizations, based on four stages as described in [16]: requirement identification, partner selection, task execution and group dissolution. The items described throughout this paper support this life cycle.
3.2 Concept Relationships
Figure 1 illustrates the existing relationships between the concepts defined in section 3.1; in the diagram, (+) indicates a positive relation and (-) a negative relation. Reputation, quality of knowledge sharing (QoKS), contracts, conflict, and rewards/punishment are important characteristics related to trust. For example, a low level of QoKS leads to a decrease in trust from other agents, and vice versa. Also, well-managed conflict or contracts ([1],[12]) can help in the process of building trust among entities. Moreover, in the early stages of a relationship, trust can be built based on rules ([12]), which can be translated into a reward/punishment scheme.
Fig. 1. Identified relationships for KBVO formation (nodes: trust, reputation, satisfaction, rewards/punishment, conflict, performance, coordination & communication, knowledge quality, knowledge quantity, contracts, community participation, quality of KS; edges annotated with (+)/(-) relations and supporting references)
Coordination and communication can have a great influence on the final performance of a group and on the process of defining contracts. Good coordination leads to better task execution, and appropriate communication between entities will allow them to promote their knowledge, make bids and carry on negotiation processes. The model illustrated in figure 1 also shows how reputation can be a motivating factor for individuals or groups to share knowledge. One way to increase community reputation is by sharing more knowledge and participating in the community more actively. However, this is not enough to increase reputation; shared knowledge must be of acceptable quality to the other members, or reputation may decrease instead.
Community promotion and community size and age are not illustrated in the diagram because they are independent measurements. For example, we can measure the number of members in a group (size), and the period of time for which they have been interacting together, which indicates their possible level of experience. Additionally, community promotion in our context refers to the process of advertising knowledge services and experience in a group. We assume this process is willingly carried out by agents regardless of any other factors.
3.3 Initial Models: Trust and Reputation
This section presents the conceptual model produced to describe trust and reputation in the E-Decisional Community. Trust and reputation were selected mainly because their degree of importance according to figure 1 (i.e. the number of connecting relationships with other concepts) is the highest. Throughout this section, it is assumed that an agent a_i is part of a system Ω which has at least two agents,

\Omega = \{a_1, a_2, a_3, \ldots, a_n\}, \quad n \geq 2
3.3.1 Trust Model
In the context of our proposal, every interaction between agents can have two possible outcomes, just like in a Bernoulli trial: success or failure. Consequently, the interaction result between two agents is defined as:

I(a_i, a_j) \in \{\mathrm{SUCCESS}, \mathrm{FAILURE}\}, \quad \forall i \neq j \qquad (1)
Our approach is oriented towards knowing how much trust should be placed in an agent based on the outcomes of previous interactions. In other words, agent A wants to know in advance what the probability of a success is before interacting with agent B, using its experience to calculate such a result. To model this behavior, Laplace's Rule of Succession was used to define T, the level of trust between two agents, as follows:
T(a_i, a_j) = \frac{S(I(a_i, a_j)) + 1}{H(I(a_i, a_j)) + 2}, \quad \forall i \neq j,\ 0 \leq T(a_i, a_j) \leq 1 \qquad (2)
In the previous equation, S(I(a_i, a_j)) is the total number of successful interactions and H(I(a_i, a_j)) is the total number of interactions, including failures. Additionally, following Zhuge's approach presented in [22], trust will decay over time, making older results less important. As a consequence, the decay factor λ is introduced as a way to automatically decrease trust levels amongst agents when there are no new interactions. Therefore, the value of trust after a periodic decay is calculated by:
T(a_i, a_j) = T(a_i, a_j) - T(a_i, a_j) \cdot \lambda, \quad 0 \leq \lambda < 1 \qquad (3)
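A minimal sketch of equations (2) and (3) follows; the class and method names are hypothetical and only the arithmetic of the model is shown.

```java
/**
 * Sketch of the trust model: trust is the Laplace estimate
 * (successes + 1) / (interactions + 2), equation (2), and decays by a factor
 * lambda when no new interactions occur, equation (3). Names are illustrative,
 * not taken from the paper's prototype.
 */
public class TrustRecord {

    private int successes;      // S(I(a_i, a_j))
    private int interactions;   // H(I(a_i, a_j)): successes and failures together
    private double trust = 0.5; // bootstrapped from reputation before any interaction

    /** Record the outcome of one interaction and recompute trust, equation (2). */
    public void recordInteraction(boolean success) {
        interactions++;
        if (success) {
            successes++;
        }
        trust = (successes + 1.0) / (interactions + 2.0);
    }

    /** Periodic decay when there are no new interactions, equation (3). */
    public void decay(double lambda) {
        if (lambda < 0.0 || lambda >= 1.0) {
            throw new IllegalArgumentException("lambda must satisfy 0 <= lambda < 1");
        }
        trust = trust - trust * lambda;
    }

    public double getTrust() {
        return trust;
    }

    public static void main(String[] args) {
        TrustRecord t = new TrustRecord();
        t.recordInteraction(true);
        t.recordInteraction(true);
        t.recordInteraction(false);
        System.out.println("Trust after 2 successes in 3 interactions: " + t.getTrust()); // 0.6
        t.decay(0.10); // 10% decay, as in the experiments
        System.out.println("Trust after one decay step: " + t.getTrust());                // 0.54
    }
}
```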
3.3.2 Reputation Model
Reputation in the E-Decisional Community is basically an average of the trust that many agents have placed in another one over time. If agent A reports a high level of
trust in agent B, the reputation of B in the community will increase, and vice versa. Also, since trust decays over time, reputation will also be affected. The reputation level R(a_i) for an agent a_i is simply the accumulation of previous reputation levels divided by the total number of opinions (feedback) from other agents, F(a_i):

R(a_i) = \frac{1}{F(a_i)} \sum_{k=1}^{F(a_i)} R_k(a_i), \quad 0 \leq R_k(a_i) \leq 1,\ F(a_i) \geq 1 \qquad (4)
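Equation (4) can be sketched as a simple running average; the class below is illustrative (names are hypothetical) and the 0.5 default anticipates the initialisation rule discussed in the next paragraph.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Sketch of a reputation bookkeeper following equation (4): reputation is the
 * average of the trust values (feedback) reported by other agents, and an
 * unknown agent starts at 0.5. In the prototype this role is played by a
 * dedicated reputation-service agent; names here are hypothetical.
 */
public class ReputationService {

    private final Map<String, Double> reputationSum = new HashMap<>();
    private final Map<String, Integer> feedbackCount = new HashMap<>();

    /** Another agent reports how much trust it currently places in 'agentName'. */
    public void reportTrust(String agentName, double trustValue) {
        reputationSum.merge(agentName, trustValue, Double::sum);
        feedbackCount.merge(agentName, 1, Integer::sum);
    }

    /** R(a_i): average of all reported values, or 0.5 if the agent is unknown. */
    public double reputationOf(String agentName) {
        Integer count = feedbackCount.get(agentName);
        if (count == null || count == 0) {
            return 0.5; // no record yet: neither trustworthy nor untrustworthy
        }
        return reputationSum.get(agentName) / count;
    }

    public static void main(String[] args) {
        ReputationService service = new ReputationService();
        System.out.println("New agent: " + service.reputationOf("AG7"));      // 0.5
        service.reportTrust("AG7", 0.9);
        service.reportTrust("AG7", 0.7);
        System.out.println("After feedback: " + service.reputationOf("AG7")); // 0.8
    }
}
```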
When the system runs for the first time and there is no record of previous interactions between agents, reputation will be used by entities as the initial trust level. In addition, when a new agent enters the system, its reputation level will be 0.5, since it is not possible to indicate beforehand whether it is a trustworthy agent or not.
3.3.3 Case Study and Experiment Design
An agent prototype was developed using JADE 4.0 [19]. As part of this process, a preliminary API version for the E-Decisional Community was generated. The prototype's goal is to validate how trust changes over time based on experiences represented by Decisional DNA and automatic decay, and to examine how reputation behaves in such conditions. In this application, it is assumed that all agents know each other and that they belong to the same organizational unit. There is no need to create knowledge-based virtual organizations yet. This is out of our scope at this point, mainly because validation and formalization of the remaining features is needed before implementing the full virtual organization model.
The application's context is the following: an agent wants to know about a cheap mechanic to fix its car. It asks its fellow agents, who should return information when the average cost per repair of their known mechanics is less than or equal to the one proposed by the initiator agent. Based on the result of each interaction, agents recalculated trust and reported the change to a reputation service. To simulate "bad" agents, a random generator was used to decide whether or not to return knowledge that did not comply with the query. In addition, each agent has a behavior that runs periodically to decay trust at a rate of 10% each time. Trust was modeled as a component of each agent's mental state, meaning that each agent holds a map of their known peers and the respective trust level. Reputation is provided as a service by a specialized agent, and each peer must register itself with the reputation service the first time it enters the system. For the experiments, one dummy agent and ten working agents were created. The dummy agent sent a message to trigger a reaction in the working agents, which generated trust and reputation recalculations based on interaction experience. The experiment ran fifty independent trials.
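The periodic 10% decay described above could be realised in JADE roughly as follows; the one-minute period, the field names and the unconditional decay of every entry are simplifying assumptions made for this sketch.

```java
import jade.core.Agent;
import jade.core.behaviours.TickerBehaviour;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Sketch of a peer agent that keeps a map of trust levels towards its known
 * peers and a ticker behaviour that decays them by 10% on every tick. The
 * prototype decays trust only when no interaction occurred during the period;
 * this sketch decays every entry unconditionally for brevity.
 */
public class PeerAgent extends Agent {

    private final Map<String, Double> trustTowardsPeers = new ConcurrentHashMap<>();
    private static final double LAMBDA = 0.10;          // 10% decay per tick
    private static final long DECAY_PERIOD_MS = 60_000; // assumed one-minute period

    @Override
    protected void setup() {
        addBehaviour(new TickerBehaviour(this, DECAY_PERIOD_MS) {
            @Override
            protected void onTick() {
                // Equation (3): T = T - T * lambda for every known peer.
                trustTowardsPeers.replaceAll((peer, trust) -> trust - trust * LAMBDA);
            }
        });
    }
}
```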
Fig. 2. Trust values for the experiment (trust values from AG1 towards AG4, AG7, AG8 and AG9 over 50 iterations)
− AG7: Depicts the behavior of a trustworthy agent that keeps a "good" behavior for the duration of the experiment.
− AG4: Shows an "intermediate" behavior; trust values for this agent stay around 0.6 on average for the majority of the tests.
− AG8: Shows how an agent starts with a sequence of "bad" responses and, after changing its attitude, gains trust from the others again. It takes a long time for reputation to start increasing again after several misbehaviors.
− AG9: Illustrates how an agent gets bad trust after repeatedly replying with inaccurate information. In this case reputation keeps going down.
Figure 3 clearly shows a stair-like behavior in the trust levels. This means that trust decays by λ percent while agents do not interact with each other. After a new interaction is executed, trust is recalculated again and increases/decreases by a small amount.
Fig. 3. Trust values for the experiment (trust decay in AG1: values towards AG4, AG7, AG8 and AG9 over 300 observations)
Fig. 4. Reputation values for the experiment (reputation of AG4, AG7, AG8 and AG9 over 50 iterations)
In
this series of experiments, trust decays only if agents have not interacted in the last minute. For applications in an organizational context, this time-scale should be extended to days or weeks to better resemble human behavior. Finally, figure 4 shows the behavior of reputation through time. As expected, the diagram shows that reputation behaves depending on trust, just as defined in the model. The variations in the reputation values also show a more stable behavior than trust, in the sense that variations due to success or failure are not reflected as drastically as in figures 2 and 3. This means that a bad opinion about a fellow agent does not necessarily have a great impact on the perceived reputation inside the community; to seriously affect reputation, an agent must receive several bad reviews from its peers.
4 General Discussion and Future Work
The social and technical aspects of knowledge-based virtual organizations have been presented and discussed in this paper, with the main goal of establishing a set of initial requirements to support dynamic group formation in the E-Decisional Community for automated knowledge sharing and decision-making. Other researchers have explored and measured trust and reputation for virtual organizations and agent-based systems in greater detail, using sophisticated mathematical models, but they do not deal with knowledge sharing ([5],[14],[20]). However, these approaches identify some key elements that cannot be left aside when designing a new alternative for virtual organization operation. Consequently, the most relevant elements of the existing solutions are used as a foundation for our present and future work. Our intention is not to create a novel trust and reputation model, but to develop a platform that takes into account the aforesaid ideas and concepts as the driving force that will help us create a knowledge market using SOEKS and Decisional DNA. The work and experiments described in this document are a first step towards the development of such a platform, and because of that, further refinement, validation and testing need to be carried out.
Inclusion of additional measures, based on the concepts described in sections 3.1 and 3.2, will eventually allow us to develop and test the knowledge-based virtual organization formation model more thoroughly. As mentioned in section 3.3.3, the initial tests are only intended to measure the behavior of the proposed model, without implementing any intricate mechanisms. The results obtained so far reflect, in some way and according to our perception, human-like behavior in computer systems. For instance, the perceived trust towards a peer can be drastically changed by a single misleading act. After losing trust in someone who has previously deceived us, it takes a long time and effort on the other party's part to gain that trust back. On the other hand, if another person always helps us with true knowledge, our perceived trust towards that person will increase. These attitudes inherent to humans are reflected in the set of experiments carried out. Finally, another topic in the future research agenda for the E-Decisional Community is Decisional DNA appropriation, which means using the elements defined in section 3 to determine how new knowledge is merged with the knowledge that an entity already possesses. Determining the quality of knowledge and user satisfaction, among other criteria, will support this process. Desirably, once agents have assimilated knowledge after several interactions, the answer to a query should eventually converge to a common and globally accepted solution.
References 1. Arenas, A., Wilson, M.: Contracts as Trust Substitutes in Collaborative Business. Computer 41(7), 80–83 (2008), 0018–9162 2. Burn, J.M., Ash, C.: Knowledge management strategies for virtual organisations. In: Kisielnicki, J. (ed.) Modern Organizations in Virtual Communities, pp. 1–18. IRM Press (2002) 3. Chiu, C.-M., Hsu, M.-H., Wang, E.T.G.: Understanding knowledge sharing in virtual communities: An integration of social capital and social cognitive theories. Decision Support Systems 42(3), 1872–1888 (2006), 0167–9236, doi:10.1016/j.dss.2006.04.001 4. Foster, I., Kesselman, C., Tuecke, S.: The Anatomy of the Grid: Enabling Scalable Virtual Organizations. Int. J. High Perform. Comput. Appl. 15(3), 200–222 (2001), 1080667 5. Hermoso, R., Centeno, R., Billhardt, H., Ossowski, S.: Extending virtual organizations to improve trust mechanisms. Paper Presented at the Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems, Estoril, Portugal, vol. 3 (2008) 6. Hoffner, Y., Field, S., Grefen, P., Ludwig, H.: Contract-driven creation and operation of virtual enterprises. Comput. Netw. 37(2), 111–136 (2001), 511672 7. Koh, J., Kim, Y.G.: Knowledge sharing in virtual communities: an e-business perspective. Expert Systems with Applications 26(2), 155–166 (2004), 0957–4174, doi:10.1016/S09574174(03)00116-7 8. Lin, C., Standing, C., Liu, Y.-C.: A model to develop effective virtual teams. Decision Support Systems 45(4), 1031–1045 (2008), 0167–9236 9. Lin, M.-J.J., Hung, S.-W., Chen, C.-J.: Fostering the determinants of knowledge sharing in professional virtual communities. Computers in Human Behavior 25(4), 929–939 (2009), 0747–5632, doi:10.1016/j.chb.2009.03.008
10. Mancilla-Amaya, L., Sanín, C., Szczerbicki, E.: Smart Knowledge-Sharing Platform For E-Decisional Community. Cybernetics and Systems: An International Journal 41(1), 17–30 (2010) 11. Norman, T.J., Preece, A., Chalmers, S., Jennings, N.R., Luck, M., Dang, V.D., et al.: Agent-based formation of virtual organisations. Knowledge-Based Systems 17(2-4), 103– 111 (2004), 0950–7051, doi:10.1016/j.knosys.2004.03.005 12. Panteli, N., Sockalingam, S.: Trust and conflict within virtual inter-organizational alliances: a framework for facilitating knowledge sharing. Decision Support Systems 39(4), 599–617 (2005), 0167-9236, doi:10.1016/j.dss.2004.03.003 13. Papaioannou, T.G., Stamoulis, G.D.: Reputation-based estimation of individual performance in collaborative and competitive grids. In: Future Generation Computer Systems (2009) (in Press) (Corrected Proof), 0167–739X, doi:10.1016/j.future.2009.05.007 14. Patel, J.: A Trust and Reputation Model for Agent-Based Virtual Organisations. University of Southampton (2006) 15. Patel, J., Luke Teacy, W.T., Jennings, N.R., Luck, M., Chalmers, S., Oren, N., et al.: Agent-based virtual organisations for the Grid. Multiagent and Grid Systems - Smart Grid Technologies & Market Models 1(4), 237–249 (2005), 1233713 16. Preece, A.: Toward a Marketplace Infrastructure for Virtual Organisations. AAAI Technical Report WS-00-04, pp. 54-62 (2000) 17. Sanin, C., Mancilla-Amaya, L., Szczerbicki, E., CayfordHowell, P.: Application of a Multi-domain Knowledge Structure: The Decisional DNA. In: Nguyen, N.T., Szczerbicki, E. (eds.) Intelligent Systems for Knowledge Management, vol. 252, pp. 65–86. Springer, Heidelberg (2009) 18. Spaulding, T.J.: How can virtual communities create value for business? Electronic Commerce Research and Applications 9(1), 38–49 (2010), 1567-4223, doi:10.1016/j.elerap.2009.07.004 19. Java Agent DEvelopment Framework, http://jade.tilab.com/ 20. Yu, B., Singh, P.M.: An evidential model of distributed reputation management. Paper Presented at the Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems: Part 1 (2002) 21. Zhai, Y., Qu, Y., Gao, Z.: Agent-Based Modeling for Virtual Organizations in Grid. In: Jin, H., Pan, Y., Xiao, N., Sun, J. (eds.) GCC 2004. LNCS, vol. 3252, pp. 83–89. Springer, Heidelberg (2004) 22. Zhuge, H., Guo, W.: Virtual knowledge service market–For effective knowledge flow within knowledge grid. Journal of Systems and Software 80(11), 1833–1842 (2007), 01641212, doi:10.1016/j.jss.2007.02.028 23. Zuzek, M., Talik, M., Swierczynski, T., Wisniewski, C., Kryza, B., Dutka, L., et al.: Formal Model for Contract Negotiation in Knowledge-Based Virtual Organizations. Paper Presented at the Proceedings of the 8th International Conference on Computational Science, Part III (2008)
Decisional DNA Applied to Robotics
Haoxi Zhang, Cesar Sanin, and Edward Szczerbicki
Faculty of Engineering and Built Environment, School of Engineering, The University of Newcastle, Callaghan, NSW, Australia 2308
[email protected], {Cesar.Sanin,Edward.Szczerbicki}@newcastle.edu.au
Abstract. As a domain-independent, flexible and standard knowledge representation structure, Decisional DNA allows its domains to acquire and store experiential knowledge and formal decision events in an explicit way. In this paper, we explore an approach that integrates Decisional DNA with robots in order to test the usability and availability of this novel knowledge representation structure. Core issues in using this Decisional DNA-based method include capturing of knowledge, storage and indexing of knowledge, organization of the knowledge base memory, and retrieval of knowledge from memory according to current problems. We demonstrate our approach in a set of experiments, in which the robots capture knowledge from their tasks and are able to reuse such knowledge in following tests. Keywords: Decisional DNA, Set of Experience Knowledge Structure, knowledge representation, robotics, XML.
1 Introduction
Nowadays, robots are widely used in many different fields, such as manufacturing, space exploration, military applications, and medicine. The successful development of autonomous robots evidently requires the capability of making decisions and acting properly by the robot itself. Recently, the experience-based approach, as one of the solutions, has become a topic of growing interest. It enables robots to reuse and evolve experiences that have been accumulated through day-to-day operations in order to support decision-making and behavior control.
Several experience-based studies and theories in robotics can be found in the literature. Goel et al. [7] present an experience-based approach to help autonomous mobile robots do navigational path planning. In their method, new problems are solved by retrieving and adapting the solutions to similar problems met before. The robot learns from the successes and failures of its previous performance, and new solutions are stored in the case memory for potential future reuse. Beetz et al. [8] applied experience-based learning techniques to their AGILO robot soccer team. By using the AGILO simulator, soccer robots were trained adequately on a large variety of tasks. As a result, the performance of ball approaching, role assigning, and coordinated plays was improved. Stulp et al. [9] present an approach combining imitation, analysis, and
experience-based learning to collect training data. Then the robot uses the acquired training data to learn accurate models for a classic mobile manipulation task: approaching a worksurface in order to grasp an object. The experience-based learning is the key to transforming the model into the robot's own skills, in which the robot pilots to different positions and captures whether or not it successfully grasps the goal object from these positions. Nicolescu and Mataric [10] present a method that enhances the capabilities of robots, enabling them to learn representations of multiple tasks from their own experiences accumulated through interactions with humans: a learner robot maps its internal behaviors to its own study of the environment by following a human or robot teacher, and builds a behavior-based representation of the experienced task at run-time.
Decisional DNA, as a domain-independent, flexible and standard knowledge representation structure [1], can potentially address and improve upon the problem just mentioned in the above paragraph: it not only captures and stores experiential knowledge in an explicit and formal way [2], but can also be easily applied to various domains to support decision-making and standard knowledge sharing and communication among these systems. In this paper, we present an approach that integrates Decisional DNA with robotics to achieve robot control. Also, we demonstrate this approach in a set of experiments in order to test the usability and availability of Decisional DNA in supporting decision-making and knowledge sharing for robotics.
This paper is organized as follows: section two describes the knowledge representation structure (the Decisional DNA); section three presents the implementation of the Decisional DNA applied to robotics; section four demonstrates the experiment results of our approach. Finally, in section five, concluding remarks and future work are drawn.
2 Set of Experience Knowledge Structure and Decisional DNA
The Set of Experience Knowledge Structure (SOEKS, or SOE for short) is a domain-independent, flexible and standard knowledge representation structure [1]. It has been developed to acquire and store formal decision events in an explicit way [2]. It is a model based upon available and existing knowledge, which must adapt to the decision event it is built from (i.e. it is a dynamic structure that depends on the information provided by a formal decision event) [3]; besides, it can be represented in XML or OWL as an ontology in order to make it transportable and shareable [4] [5]. SOEKS is composed of variables, functions, constraints and rules associated in a DNA shape, permitting the integration of the Decisional DNA of an organization [3]. Variables normally involve representing knowledge using an attribute-value language (i.e. by a vector of variables and values) [6], and they are the centre root of the structure and the starting point for the SOEKS. Functions represent relationships between a set of input variables and a dependent variable; moreover, functions can be applied for reasoning about optimal states. Constraints are another form of association among the variables. They are restrictions on the feasible solutions, limitations of possibilities in a decision event, and factors that restrict the performance of a system. Finally, rules are relationships between a consequence and a condition linked by the statements IF-THEN-ELSE. They are conditional relationships that control the universe of variables [3].
Additionally, SOEKS is designed to resemble DNA in some important features. First, the combination of the four components of the SOE gives uniqueness, just as the combination of the four nucleotides of DNA does. Secondly, the elements of SOEKS are connected with each other in order to imitate a gene, and each SOE can be classified and acts like a gene in DNA [3]. As the gene produces phenotypes, the SOE brings values of decisions according to the combined elements. Then a decisional chromosome storing decisional "strategies" for a category is formed by a group of SOE of the same category. Finally, a diverse group of SOE chromosomes comprises what is called the Decisional DNA [2]. In short, as a domain-independent, flexible and standard knowledge representation structure, SOEKS and Decisional DNA provide an ideal approach which can not only be very easily applied to various domains, but also enables standard knowledge communication and sharing among these systems.
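A minimal sketch of how a SOE could be represented follows; it is illustrative only and does not reproduce the actual Decisional DNA API from [1].

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Minimal sketch of a Set of Experience (SOE) holding the four SOEKS
 * components: variables, functions, constraints and rules. The class layout
 * and the sample values are illustrative assumptions.
 */
public class SetOfExperience {

    /** Attribute-value pair: the centre root of the structure. */
    public static class Variable {
        final String name;
        final double value;
        Variable(String name, double value) { this.name = name; this.value = value; }
    }

    private final List<Variable> variables = new ArrayList<>();
    private final List<String> functions = new ArrayList<>();   // e.g. "timeCost = f(lineColor, motorSpeed)"
    private final List<String> constraints = new ArrayList<>(); // e.g. "motorSpeed <= 720"
    private final List<String> rules = new ArrayList<>();       // IF-THEN-ELSE statements

    public void addVariable(String name, double value) { variables.add(new Variable(name, value)); }
    public void addFunction(String f)   { functions.add(f); }
    public void addConstraint(String c) { constraints.add(c); }
    public void addRule(String r)       { rules.add(r); }

    public static void main(String[] args) {
        // One formal decision event captured as a SOE (values are made up).
        SetOfExperience soe = new SetOfExperience();
        soe.addVariable("lineColor", 2);
        soe.addVariable("averageRepairCost", 120.0);
        soe.addFunction("timeCost = f(lineColor, motorSpeed)");
        soe.addConstraint("motorSpeed <= 720");
        soe.addRule("IF lineColor == GOAL_LINE THEN follow ELSE skip");
        System.out.println("SOE with " + soe.variables.size() + " variables created");
    }
}
```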
3 Implementation of Decisional DNA Applied to Robotics
3.1 Software Architecture
The Decisional DNA-based application consists of the User Interface, the Integrator, the Prognoser, the XML Parser and the Decisional DNA Repository (see Fig. 1). Section 3.2 explains the components in detail.
Fig. 1. System Architecture for Decisional DNA Applied to Robotics
3.2 Software Components
3.2.1 Integrator
The Integrator is the place where the robot's scenario data is gathered and organized. In our case, we link each experience with a certain scenario. Therefore, the scenario is transformed into Variables, Functions, Constraints, and Rules describing the circumstances under which the experience is acquired, such as the values of the robot's motors, servos, joints, sensors and other peripherals. The robot's scenario data is essential for estimating the robot's status and making decisions. The Integrator organizes the robot's scenario data into a string of numbers using a PORT•VALUE format and sends it to the Prognoser for further processing.
3.2.2 User Interface
The User Interface is developed to interact with the user. In particular, the user sets and configures the system through the User Interface. For example, the user can set the system date, choose which mode to run (e.g. training or test), and read feedback from the User Interface.
3.2.3 Prognoser
The Prognoser is in charge of sorting, analyzing, organizing, and creating experience. It first sorts the data received from the Integrator, and then it analyzes and organizes the data according to the system configuration. Finally, it interacts with the Decisional DNA
Repository and the XML Parser in order to store and reuse experience depending on the purpose of different tasks.
3.2.4 XML Parser
The XML Parser is the converter that translates knowledge statements generated by the Prognoser into the Decisional DNA experience structure represented in XML format, and interprets the retrieved XML-represented Decisional DNA experience for reuse.
3.2.5 Decisional DNA Repository
The Decisional DNA Repository is the core software component of our approach. It is the place where experiences are stored and managed. It uses standard XML to represent experiential knowledge, which makes standard knowledge sharing and communication easier. It is composed of the Repository Manager and XML files:
a) Repository Manager. The Repository Manager is the interface of the Decisional DNA Repository. It answers operation commands sent by the Prognoser and manages the XML files. It has two main tasks: searching experiences and managing XML files.
b) XML files. We use a set of XML tags described in [1] to store Decisional DNA in XML files. In this way, Decisional DNA is represented explicitly and is ready to be shared among different systems.
3.3 Hardware and Firmware Platform
To test the Decisional DNA applied to robotics, we introduced the LEGO® Mindstorms® NXT 2.0 robot, equipped with a 32-bit ARM7 microcontroller, 256 Kbytes of FLASH, 64 Kbytes of RAM [11], a color sensor and an ultrasonic sensor. Since the Decisional DNA is defined in Java [1], we flashed LeJOS onto the NXT 2.0 as its firmware. LeJOS is a Java programming environment for the NXT® robot [12].
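Returning to the XML files of Section 3.2.5, the sketch below illustrates how the Repository Manager might serialize one experience; the tag names are placeholders, since the actual tag set is the one described in [1].

```java
/**
 * Sketch of SOE-to-XML serialization. The element names used here are
 * hypothetical stand-ins for the tag set of [1]; only the idea of writing an
 * experience as explicit, shareable XML is illustrated.
 */
public class SoeXmlWriter {

    public static String toXml(String variableName, double value, String rule) {
        StringBuilder xml = new StringBuilder();
        xml.append("<setOfExperience>\n");
        xml.append("  <variables>\n");
        xml.append("    <variable name=\"").append(variableName)
           .append("\" value=\"").append(value).append("\"/>\n");
        xml.append("  </variables>\n");
        xml.append("  <rules>\n");
        xml.append("    <rule>").append(rule).append("</rule>\n");
        xml.append("  </rules>\n");
        xml.append("</setOfExperience>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        // The Repository Manager would store this string in an XML file.
        System.out.println(toXml("lineColor", 2, "IF lineColor == GOAL_LINE THEN follow"));
    }
}
```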
4 Experiment
We tested our concepts on two 70cm×100cm maps (see Fig. 2). There are three lines drawn in different colors, and each line ends at a final spot. Spot B is the goal for the robot and is colored in red; the other two final spots are drawn in yellow. The robot starts at A. The idea is that the robot uses the Decisional DNA to capture and reuse experience in order to find out the best line to follow, and eventually reach its goal (i.e. spot B).
Fig. 2. Experiment Maps
First, we train the robot. It follows every line and gathers information about these lines. After training, the robot has its own experience of which line reaches the goal. As a result, the robot chooses the right line to follow, and reaches the goal by using the experience gained in training.
In order to demonstrate the advantages of using Decisional DNA, we compared the time cost between using the Decisional DNA and not using it. When using the Decisional DNA, the robot queries its Decisional DNA Repository first, and gets the
experience on which line reaches the goal. Then, it detects the lines' colors and follows the right one. When not using Decisional DNA, the robot scans the map until it finds the red spot (i.e. the goal). We ran each test 30 times, both using Decisional DNA and not using Decisional DNA. Fig. 3 and Fig. 4 illustrate the details of the time cost for our test on map 1. As we can see, the time cost is much lower and more stable when using Decisional DNA than when not using it.
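The comparison just described can be summarised by the following control-flow sketch; the robot-control calls and the repository query are hypothetical stand-ins for the LeJOS and Decisional DNA Repository code, and the timings in the stub are placeholders rather than measured data.

```java
/**
 * Sketch of the two experiment modes: reuse the trained experience from the
 * Decisional DNA Repository versus exhaustively scanning the map. All
 * interfaces, names and stub timings are illustrative assumptions.
 */
public class LineExperiment {

    interface RobotControl {
        long followLineOfColor(int colorId);  // returns elapsed milliseconds
        long scanMapForRedSpot();             // returns elapsed milliseconds
    }

    interface DecisionalDnaRepository {
        /** Experience gained in training: which line color reached the goal. */
        int queryGoalLineColor();
    }

    /** Run with Decisional DNA: reuse the trained experience. */
    static long runWithDecisionalDna(RobotControl robot, DecisionalDnaRepository repo) {
        int goalColor = repo.queryGoalLineColor();
        return robot.followLineOfColor(goalColor);
    }

    /** Run without Decisional DNA: scan until the red spot is found. */
    static long runWithoutDecisionalDna(RobotControl robot) {
        return robot.scanMapForRedSpot();
    }

    public static void main(String[] args) {
        RobotControl stubRobot = new RobotControl() {
            public long followLineOfColor(int colorId) { return 90_000; }  // stub timing, not measured data
            public long scanMapForRedSpot()            { return 300_000; } // stub timing, not measured data
        };
        DecisionalDnaRepository stubRepo = () -> 2; // pretend color id 2 marks the goal line
        System.out.println("With D-DNA:    " + runWithDecisionalDna(stubRobot, stubRepo) + " ms");
        System.out.println("Without D-DNA: " + runWithoutDecisionalDna(stubRobot) + " ms");
    }
}
```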
Fig. 3. Time Cost Details Comparison on Map 1 (time cost in milliseconds, using D-DNA vs. not using D-DNA)
Fig. 4. Time Cost Overall Comparison on Map 1 (minimum, average and maximum time cost in milliseconds, using D-DNA vs. not using D-DNA)
Fig. 5 and Fig. 6 show the test results on map 2. Using Decisional DNA is still much quicker and more stable than not using it.
Fig. 5. Time Cost Details Comparison on Map 2 (time cost in milliseconds, using D-DNA vs. not using D-DNA)
Fig. 6. Time Cost Overall Comparison on Map 2 (minimum, average and maximum time cost in milliseconds, using D-DNA vs. not using D-DNA)
5 Conclusions and Future Work
We presented an approach that allows a robot to capture and reuse its own experiences by applying Decisional DNA to robotics. We described the architecture of our approach, and implemented and tested our concepts on a LEGO® Mindstorms® NXT 2.0 robot and two maps. The Decisional DNA, as a novel knowledge representation structure, not only can be easily applied to robotics, but also improves the performance of robots when they use experience.
Since the Decisional DNA applied to robotics is at its early development stages, further research and refinement remain to be done; some of the open tasks are:
− Refinement of the requirements of Decisional DNA for robotics: aspects such as inter-process communication strategies, life cycle management and protocols need to be explored in detail.
− Enhancement of the compactness and efficiency of Decisional DNA: optimization of algorithms, evaluation and comparison of the different strategies for Decisional DNA repository storage and query.
− Technical review of distributed systems and middleware to determine their viability for a future implementation of the Decisional DNA applied to robotics, including shareability of gained experience among different robots.
References 1. Maldonado Sanin, C.A.: Smart Knowledge Management System, PhD Thesis, Faculty of Engineering and Built Environment - School of Mechanical Engineering, University of Newcastle, E. Szczerbicki, Doctor of Philosophy Degree, Newcastle (2007) 2. Sanin, C., Szczerbicki, E.: Experience-based Knowledge Representation SOEKS. Cybernetics and Systems 40(2), 99–122 (2009) 3. Sanin, C., Mancilla-Amaya, L., Szczerbicki, E., CayfordHowell, P.: Application of a Multi-domain Knowledge Structure: The Decisional DNA. In: Intel. Sys. For Know. Management. SCI, vol. 252, pp. 65–86 (2009) 4. Sanin, C., Szczerbicki, E.: Extending Set of Experience Knowledge Structure into a Transportable Language Extensible Markup Language. International Journal of Cybernetics and Systems 37(2-3), 97–117 (2006) 5. Sanin, C., Szczerbicki, E.: An OWL ontology of Set of Experience Knowledge Structure. Journal of Universal Computer Science 13(2), 209–223 (2007) 6. Lloyd, J.W.: Logic for Learning: Learning Comprehensible Theories from Structure Data. Springer, Berlin (2003) 7. Goel, A., Donnelan, M., Vazquez, N., Callantine, T.: An integrated experience-based approach to navigational path planning for autonomous mobile robots. In: Working notes of the AAAI Fall Symposiumon Applications of Artificial Intelligence to Real-World Autonomous Mobile Robots, Cambridge, MA, pp. 50–61 (1992) 8. Beetz, M., Schmitt, T., Hanek, R., Buck, S., Stulp, F., Schroter, D., Radig, B.: The AGILO Robot Soccer Team - Experience-Based Learning and Probabilistic Reasoning in Autonomous Robot Control. Autonomous Robots 17, 55–77 (2004) 9. Stulp, F., Fedrizzi, A., Zacharias, F., Tenorth, M., Bandouch, J., Beetz, M.: Combining Analysis, Imitation, and Experience-based Learning to Acquire a Concept of Reachability in Robot Mobile Manipulation. In: Proc., the International Conference on Humanoid Robots, Paris (2009) 10. Nicolescu, M.N., Mataric, M.J.: Experience-based Representation Construction: Learning from Human and Robot Teachers. In: Proc., IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, Hawaii, pp. 740–745 (2001) 11. LEGO, What is NXT, http://mindstorms.lego.com/en-us/whatisnxt/default.aspx 12. LEJOS, Introduction, http://lejos.sourceforge.net/nxt/nxj/tutorial/ Preliminaries/Intro.htm
Supporting Management Decisions with Intelligent Mechanisms of Obtaining and Processing Knowledge
Cezary Orłowski and Tomasz Sitek
Gdańsk University of Technology, Department of Management and Economics, ul. Narutowicza 11/12, Gdansk, Poland
{cezary.orlowski,tomasz.sitek}@zie.pg.gda.pl
Abstract. This paper is a summary of the authors' work on a model of support for decision processes in organisation management. Key decisions in organisations are often made without sufficient knowledge and by people lacking the proper qualifications and experience. Without full support of decision processes, real threats to organisations are created. Therefore, to support decision processes in numerous organisations, apart from experts in their field, knowledge-based (expert) systems are applied. Those designed to support decision processes in technical systems are well described and widely applied. Those devoted to managers acting in sociotechnical systems are more complex and generally difficult to describe. An example of such a system for the needs of an IT organisation is presented in this paper. The authors identify possible problems of an IT organisation and introduce a decision system supplemented with elements applicable in technical and sociotechnical systems.
Keywords: knowledge base, uncertain knowledge, reasoning, expert system, information technology.
1 Decision Processes in Management
Making decisions is a complex management process strongly influencing an organisation's performance. Its dynamic character in a multidimensional institutional system results in multipronged decisions that require wider knowledge from a decision-maker. It entails more responsibility and, in the event of defective decisions, considerable costs. An example of complex conditions for decision making is running an organisation where business decisions are made. These conditions are one consequence of an increase of the technical and technological level in existing companies, which are treated as complex sociotechnical systems. Such systems require modern and precise methods of decision making [1].
1.1 The Essence of Management Decisions
In management, a decision process is defined as the choice of one action from a number of solutions available at the time. A decision process can also be defined as a conscious restraint on making a choice – which may also be treated as a choice [2]. Therefore, a decision process may be described as a number of mental operations
leading to the solution of a decision issue. No matter how much the definitions in the literature differ regarding the number of such operations or their names, it is always a sequence of actions in the form of an algorithm – the subsequent steps are a direct effect of the previous ones. A decision process consisting of five steps is shown in figure 1. It includes the definition of the situation, analysis, research of alternatives, choice and implementation.
Fig. 1. Diagram of a decision process [3]
Therefore, making decisions is a fundamental management process, conducted regardless of the level or type of activity. It should be based on the decision-maker's consideration of all available circumstances (to minimise the risk level). Such circumstances may be divided into two groups [4]:
• objective – information available at the given moment, originating from the surrounding data (e.g. statistical), reports, and indicators (i.e. KPI – Key Performance Indicators); generally, they are elements of a company's information system,
• subjective – aspects dependent on a manager's experience related to his/her educational background, but also intuition; so-called soft factors.
It is unlikely that all the mentioned factors are available at a given moment in the organisation's reality. Therefore, management decisions are made under uncertainty and risk. Such an approach results from the following threats (appearing in decision making in sociotechnical systems):
• lack of sufficient information or information of low quality,
• data and information do not constitute knowledge yet; the available information may be difficult to transform into knowledge, or such a process may be too time-consuming and/or too expensive,
• the available knowledge may have some defects, such as uncertainty, lack of precision, or incompleteness,
• management personnel may have no experience in the given area; no good practices are implemented, and processes in a company are immature (they are not repeatable),
• no decision support.
1.2 The Need for Support for Decision Processes in Management
In view of the decision problems identified above, it must be assumed that a number of decisions made in an organisation are defective. A required solution would be a
decision-making model where, together with the decision-maker, there is a second entity – a field expert. Such a model is presented in figure 2 below.
Fig. 2. Decision model with the decision support aspect
The model consists of two entities; together with the links between them, they create an environment for decision making. The constant need for decision support for a manager who is about to make a decision is a key assumption here. Each decision, before being made, should be consulted with an expert (or several experts) from a given field. The decision-maker should pass to the expert the status of knowledge about the problem and the conditions related to it. He should also deliver a description of the required status as precisely as possible (quantified, if needed). Only after such a consultation can the decision-maker make a decision, taking into account the received suggestions. Nevertheless, there are a few fundamental drawbacks of such an approach:
How can we gain experts from a given field? How can we verify the “quality” of experts? (the issue of trust) Can we count on the full availability of experts? How should we act if there is contradictory information from a few experts?
These doubts illustrate the need to design a model where the above-mentioned problems would be reduced or eliminated. A model has been presented in this paper that aims at supporting decisions in IT organisations. This model may be implemented in an organisation where complex management decisions are made. The concept described in the next chapter relies on expert-based systems (also known as expert systems).
2 The Concept of Support for Management Decisions An expert system is defined as a information technology system containing specialised knowledge on a given area of human activity, giving advice on issues that the user has difficulties to cope with [5]. Such systems provide advice, guidance, and diagnosis concerning problems from the field they are used in. They are able to reason from the resources of properly formalised knowledge, where knowledge bases are separated from reasoning modules. There is no explicit algorithm for solving decision problems.
574
C. Orłowski and T. Sitek
2.1 A Decision Support Model Based on the Expert System Concept In the context of the mentioned defects in the decision-making module, the weakest link is the human factor. Expert systems, in direct cooperation with experts, in some way eliminate defects. The decision-making model mentioned before may therefore be supplemented with an additional element – knowledge bases together with instruments supporting the obtaining and processing of knowledge. The scheme of this model is presented in figure 3 below.
Fig. 3. A decision model including the application of a knowledge-based system
It should be noted (fig. 3) that the decision process proceeds in a similar way to that in fig. 2. In this case, before making the decision there is interaction with an expert system, not an expert. This system is able to support a decision maker by providing a valuable suggestion or advice. However, making such a decision relies on knowledge (and its quality) from the knowledge bases of an expert system. Therefore, a key aspect of building such a model is the adequate preparation of knowledge and the development of the methods of processing it.
3 The Proposed Model of Decision Support in Management
Expert systems are applied in numerous fields of human activity. The majority of recorded applications concern decision support for technical systems. A distinctive feature of technical systems is the extensive availability of data that can be processed into knowledge. Such data usually come from measurement systems or databases. They are of an objective and indisputable character. The performance of technical systems using these data is clear and well described. From the point of view of obtaining and processing knowledge based on such data, the bases created are complete and are built on rules and/or facts. In contrast to technical systems, social systems are not fully understood and are therefore hard to describe. This results from fragmentary knowledge or a total lack of it. Very often, the obtained knowledge is imperfect. It is also difficult to obtain dependable or sufficiently precise knowledge, and it is rarely complete. Therefore, it is difficult to forecast the management of an organisation in such circumstances. Decisions made in such circumstances must allow for knowledge resulting from objective sources of information as well as any “soft” aspects that are difficult to measure. It also becomes necessary to treat an organisation (a company) as a sociotechnical system [6].
This paper presents a decision support system for the management area, taking into consideration problems arising from its social nature. The authors’ work is an attempt to adapt solutions from technical systems to social and sociotechnical environments.
3.1 IT Companies as an Environment for a Decision Support Model
The proposed model is designed to be a general solution. It was developed to be applied in any organisation where decisions are made in conditions requiring support. These organisations may be companies, non-commercial organisations, or science and education entities [7].
3.2 Knowledge Division
Knowledge in expert systems is usually represented by facts and rules. Technical systems aim at complete knowledge models, so the rule base should include a description of all possible states resulting from the combinations of input variables. Physically, there are usually two types of structure: a fact base and a separate rule base. In social systems, this kind of knowledge division proves to be too simple. Rule knowledge may come from various sources, which may lead, for example, to incoherence. It may also be partially uncertain (experts may express doubts about some aspects). From the point of view of the criteria applied in technical systems, such knowledge should thus be rejected. However, in management this knowledge may prove to be a valuable reference for a decision maker and should therefore be kept. Nevertheless, it is important to properly differentiate knowledge of different “ranks”. If both knowledge in the desired form (certain, complete) and imperfect knowledge are to be used, priorities should be set for them: a reasoning machine should always use certain knowledge first. In order to set such an order of precedence, it is necessary to establish a proper logical division of knowledge, and it is recommended to maintain a number of separate bases. The diagram of such a division is presented in figure 4. Instruments supporting knowledge acquisition during consultations with experts should place each rule or fact in the proper base with the proper tag; for example, uncertain knowledge should have established certainty factors (CF). This division is not assumed to be constant: knowledge should be continuously monitored and, after meeting the given conditions, may be moved from one base to another [8].
Fig. 4. Division of knowledge resources into dedicated bases and the pattern of knowledge flow [8]
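To make the division more concrete, the fragment below is a minimal, illustrative sketch (not the authors’ implementation): rules are tagged with a certainty factor, stored in a “certain” or an “uncertain” base, consulted in that order, and can be moved between bases once verified. All names and the promotion condition are assumptions introduced for illustration.

```python
# Illustrative sketch of the knowledge division: two logical bases, certainty
# factors (CF) as tags, certain knowledge consulted first, and promotion of
# verified rules between bases. Names and structure are assumed, not the authors'.
from dataclasses import dataclass, field

@dataclass
class Rule:
    condition: frozenset   # facts that must hold for the rule to fire
    conclusion: str
    cf: float = 1.0        # certainty factor; 1.0 marks certain knowledge

@dataclass
class KnowledgeStore:
    bases: dict = field(default_factory=lambda: {"certain": [], "uncertain": []})

    def add(self, rule: Rule) -> None:
        self.bases["certain" if rule.cf >= 1.0 else "uncertain"].append(rule)

    def promote(self, rule: Rule) -> None:
        # After positive verification, move a rule to the certain (production) base.
        self.bases["uncertain"].remove(rule)
        rule.cf = 1.0
        self.bases["certain"].append(rule)

    def infer(self, facts: set) -> list:
        # The reasoning machine always uses certain knowledge first.
        conclusions = []
        for base in ("certain", "uncertain"):
            for rule in self.bases[base]:
                if rule.condition <= facts:
                    conclusions.append((rule.conclusion, rule.cf))
        return conclusions

store = KnowledgeStore()
store.add(Rule(frozenset({"mature_processes"}), "use_formal_methodology"))
store.add(Rule(frozenset({"small_team"}), "use_agile_methodology", cf=0.7))
print(store.infer({"mature_processes", "small_team"}))
# [('use_formal_methodology', 1.0), ('use_agile_methodology', 0.7)]
```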
3.3 Quality of Knowledge
The imperfection of knowledge obtained in sociotechnical systems has already been mentioned. It is worth asking whether this results from the nature of the organisation or from the improper selection of experts; perhaps knowledge in the same area from a different source would lack such defects. The question about the quality of knowledge leads to questions about trust in the developed model for supporting management decisions. Should the knowledge that is obtained go directly to production bases and serve as a basis for reasoning? It seems that the risk of making possible wrong decisions would be too high. The model should therefore establish appropriate mechanisms for the verification of the obtained knowledge. In order to minimise the risk of introducing “garbage” into the knowledge bases, a mechanism of a so-called knowledge buffer was developed. This concept is schematically presented in figure 5.
Fig. 5. Concept of the function of a knowledge buffer [8]
The knowledge buffer assumes that each knowledge base has a second instance. Such a copy is identical to the original base as far as aim, structure and links with other bases are concerned. The set of such base copies creates a non-production environment to which new knowledge is moved first. It acts as a quarantine for rules and facts which, for a given time, are verified in isolated structures (unavailable to the reasoning machine). It is assumed that the knowledge in the buffer is public: all experts participating in a project have access to it. Therefore, they can assess, comment on and evaluate chosen rules and facts, finally leading to their rejection or to their acceptance and forwarding to the production bases.
3.4 Cooperation with Experts
The extent of knowledge acquisition in the presented model is tightly dependent on positive experiences from cooperation with experts. This aspect of building the model is “softer” and hard to describe, because a lot depends on the experts themselves and on the links between them and the project. The authors’ observations confirm that cooperation with specialists involved in building a system for decision support in management differs from cooperation with external experts. Deficiencies in knowledge acquisition result from incompleteness on the specialists’ side as well as from the low effectiveness of knowledge engineering (e.g. overly long sessions become arduous for experts). The interaction between the system and the person introducing knowledge must therefore be automated.
The proposed model envisages that such problems might appear. An environment for direct contact with experts is distinguished by [9]:
• a proper logic of the knowledge acquisition process (optimal with regard to the time criterion and to criteria which are hard or impossible to measure, e.g. the ease of adding knowledge, based on the authors’ experience),
• a user interface that is ergonomic and resistant to errors, where the expert is supported by the system at each step.
It must be noted that such problems do not occur within technical systems. Within these, the knowledge is based on objective data, e.g. measurements, and not on the necessity to develop special procedures with the application of sociotechnics.
3.5 Reasoning from Uncertain and Imprecise Knowledge
The awareness of possible imperfections in knowledge must be translated into proposals for solving this problem in the model. There are numerous methods for dealing with knowledge distinguished by:
• incompleteness,
• lack of precision,
• uncertainty.
Such problems rarely occur in technical systems, as imperfect knowledge is rejected there due to the need to ensure high quality. If the model is to be a basis for decision support in sociotechnical systems, a large part of the available knowledge base may be considerably incomplete. Should we use such a base? Yes, because there may be no other option; we simply need to be aware that the resulting advice and suggestions will be less precise. The proposed model assumes the application of fuzzy logic (for imprecise knowledge) and certainty levels (for uncertain knowledge). It is vital to note that a given piece of knowledge may require the application of both these methods simultaneously. Therefore, correct reasoning will only be possible if imperfections are noticed already at the stage of knowledge acquisition. The order of applying the methods is also important (first collecting knowledge in a fuzzy form, then analysing it to determine the certainty level).
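As a rough illustration of applying both methods in sequence, the fragment below first evaluates an imprecise input with a simple fuzzy membership function and then attaches a certainty level to the conclusion; the membership function and the min-combination are simplifying assumptions, not the formulas used in the model.

```python
# Illustrative only: fuzzy evaluation of an imprecise variable, followed by
# weighting the conclusion with the certainty level of the rule that produced it.

def membership_high_maturity(level: float) -> float:
    """Assumed linear membership of 'high maturity' for a level in [1, 5]."""
    return max(0.0, min(1.0, (level - 2.0) / 2.0))

def advise(maturity_level: float, rule_certainty: float) -> tuple:
    """Return (advice, confidence); confidence mixes truth degree and certainty."""
    truth = membership_high_maturity(maturity_level)
    confidence = min(truth, rule_certainty)   # conservative combination (assumption)
    advice = "formal methodology" if truth >= 0.5 else "lightweight methodology"
    return advice, confidence

# Imprecise maturity assessment of 3.4, expert rule held with certainty 0.8:
print(advise(3.4, 0.8))   # ('formal methodology', ~0.7)
```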
4 Verification of the Model
The model is a projection of reality designed on theoretical grounds; its relevance can be proved only through verification. A set of experiments was therefore conducted for the proposed solution. They aimed at showing the correctness of certain assumptions and at determining the direction of future work. The authors also wished to check whether the model can be used in organisations from various branches of industry (the model was developed relying on experiences in IT organisations). It was established that during the research special emphasis would be put on the following aspects:
• relevance of the division of knowledge bases and of knowledge flow (in the area of obtaining knowledge),
• ensuring the quality of the added knowledge (in the area of obtaining knowledge),
• developing methods of cooperation with experts (in the area of obtaining knowledge),
• analysis of reasoning mechanisms (in the area of processing knowledge).
The first experiment was conducted for a prognostic technical system. The model was applied to predicting the concentration of pollution in the air. Knowledge relating chemical concentrations of air pollution to measured meteorological parameters was collected from experts in this field in a few iterations. The rule and fact bases were prepared, a buffer mechanism was applied, and a simple application model was implemented. It was forwarded to experts who, after adjusting and comparing the results with other models, evaluated the project very positively.
The second experiment was conducted for a sociotechnical system. The model was applied to decision support in IT companies in the area of selecting the proper technologies (methods/tools) to manage an IT project. Three groups of input variables that should influence such decisions were distinguished:
• The maturity of the organisation conducting the project – in order to precisely understand the influence of this variable on the choice of methodology for managing IT projects, surveys and laboratory examinations were conducted; a maturity scale was established (based on the one used in the CMMI assessment model [10]); factors determining the evolution of IT organisations were analysed; the example rule (general form, Prolog notation): End_state :- Begin_state, Transition_processes.
• Client maturity – analyses of client behaviour conducted in this field with a survey gave knowledge about client capacities (e.g. their adjustment and adequacy for the project) [10]; the example rule: Client_maturity :- Client_adequacy, Client_fit.
• The entropy (disorganisation) of the project – it was assumed that specific examples of IT projects would be analysed, namely the implementation of corporate architecture; such projects are characterised by high complexity and therefore require the precise choice of project management tools and methods; the variables were established based on the TOGAF standard [11]; the example rule: Project_entropy :- Architecture_Type, Generated_Documentation, Area_Of_Management_Development.
The variables were pre-processed and determining factors were defined for each of them. In addition, dedicated knowledge bases were created. It must be noted that this is a sociotechnical system. Such a multilevel model allowed for the verification of all the established presumptions, at the same time indicating further work.
5 Summary
The paper summarises the authors’ work on a model of decision support in management. Based on experience gained during cooperation with IT companies, the main problems of IT managers and project managers were identified. Key decisions in organisations are made in circumstances far from perfect – without sufficient knowledge in the given field, often by people lacking the proper qualifications and experience. Therefore, the lack of decision support may be a real threat. The need for such support may be met with the assistance of IT systems. The best-suited solution is the class of systems based on acquired and formalised expert knowledge, developed in the previous century. These systems were designed to reason properly from well-defined knowledge. Unfortunately, managers do not act in such comfortable conditions. Therefore, the proposal to supplement the classic decision model in management with an expert tool will only be justified if such a concept is adapted to sociotechnical systems. The designed model was verified through a series of experiments that confirm the theoretical presumptions. These experiments show that the model might be applied both in technical and sociotechnical systems. However, one question remains unanswered: can this model be applied in any organisation? In order to confirm this, more research (experiments) needs to be conducted. The recipients of this solution should also be considered. It might appear that for some companies or institutions a change of the classic decision model is too serious a revolution: they might not be able to integrate their own processes with the applied tool, for example due to the immaturity of such processes. For further research, the authors have chosen organisations that seem able to best accommodate such projects – intelligent organisations (also known as learning organisations). In such organisations, knowledge is regarded as the key resource and its conscious processing might be supported with a dedicated tool.
References
1. Czermiński, J.: Decision Support Systems in Company Management. Wydawnictwo Dom Organizatora, Toruń (2002)
2. Runiański, J.: Before Decision. PWE, Warszawa (1965)
3. Leigh, A.: Perfect Decision-Making. Rebis, Poznań (1999)
4. Koźmiński, A.K., Piotrowski, W.: Management Theory and Practice. PWN, Warszawa (2002)
5. Mulawka, J.: Expert Systems. Wydawnictwa Naukowo-Techniczne, Warszawa (1996)
6. Orłowski, C.: Introduction. In: Orłowski, C. (ed.) Management of Information Technology. Status and Prospects. Pomorskie Wydawnictwo Naukowo-Techniczne, Gdańsk (2006)
7. Sitek, T., Orłowski, C.: Model of Management of Knowledge Bases in the Information Technology Evaluation Environment. In: Wilimowska, Z. (ed.) Information Systems Architecture and Technology: Models of the Organisation’s Risk Management. Oficyna Wydawnicza Politechniki Wrocławskiej, Wrocław (2008)
8. Sitek, T., Orłowski, C.: Knowledge Base in Multi-agent System for the Evaluation of Information Technology – Division of Knowledge and Knowledge Bases. In: Knosala, R. (ed.) 11th Conference “Computer Integrated Management”. Oficyna Wydawnicza Polskiego Towarzystwa Zarządzania Produkcją, Opole (2008)
9. Orłowski, C., Rybacki, R., Sitek, T.: Methods of Uncertain and Incomplete Knowledge Acquisition in Knowledge Processing Environment. In: Knosala, R. (ed.) 12th Conference “Computer Integrated Management”, vol. 2. Oficyna Wydawnicza Polskiego Towarzystwa Zarządzania Produkcją, Opole (2010)
10. Carnegie Mellon University, Software Engineering Institute: The Capability Maturity Model: Guidelines for Improving the Software Process. Addison-Wesley Professional, Boston (1995)
11. Chabik, J., Orłowski, C., Sitek, T.: Intelligent Knowledge-Based Model for IT Support Organization Evolution. In: Szczerbicki, E., Nguyen, H.T. (eds.) Smart Information and Knowledge Management: Advances, Challenges and Critical Issues. SCI, vol. 260. Springer, Heidelberg (2010)
12. Orłowski, C., Sitek, T.: Implementation of Enterprise Architecture Project as the Verification of Knowledge Bases in Multi-agent System. In: Orłowski, C., Kowalczuk, Z., Szczerbicki, E. (eds.) Application of Information Technology in Knowledge Management. Pomorskie Wydawnictwo Naukowo-Techniczne, Gdańsk (2009)
Finding Inner Copy Communities Using Social Network Analysis
Eduardo Merlo, Sebastián A. Ríos, Héctor Alvarez, Gaston L’Huillier, and Juan D. Velásquez
Department of Industrial Engineering, University of Chile
{emerlo,halvarez}@ing.uchile.cl, {srios,jvelasqu}@dii.uchile.cl, [email protected]
Abstract. Nowadays, technology use is a massive practice, and the Internet and digital documents are powerful tools in both professional and personal domains. However, as useful as they can be when used properly, wrong practices appear easily, and the copy & paste or plagiarism phenomenon is one of them. Copying and pasting documents is a growing practice worldwide, and Chile is not the exception; all levels of education, from elementary school to graduate studies, are directly affected by it. Regarding this concern, it has been decided in Chile to tackle the plagiarism problem among students. For this, we apply Social Network Analysis to discover groups of people associated with each other by their documents’ similarity in a plagiarism detection context. Experiments were successfully performed on real reports of graduate students at the University of Chile.
1 Introduction
The copy & paste syndrome is growing around the world. This unwanted behaviour occurs when students copy information for their written works or assignments instead of writing their own documents. This is generally done using information acquired from written sources (web pages, books, images, videos, pdf documents, etc.) or classmates’ documents, without making the appropriate references to the original sources [3,8]. Today’s technology allows everyone to access and manipulate huge amounts of digital content, making information and knowledge widely available [15]. Therefore, it is simple to manage all the information gathered from these or other sources to create a new document. In Chile, as in many other countries, the massive proliferation of schools with access to the Internet is allowing students to replace their information sources, from printed books to digital media. Some web sites, such as Wikipedia in Spanish, are popular among students for researching topics for written work. Unfortunately, students do not usually cite information sources or use them properly. This proliferation of copy & paste affects many fields, and education has been severely affected [3,15]. According to recent studies [8], plagiarism is nowadays close to 52% in schools and higher education institutions. In Chile, a total
number of 3200 students and 300 teachers in 16 educational institutions (high schools, universities, and professional institutes) were surveyed by researchers of the DOCODE Project¹. The results show that textual copy occurs in nearly 45% of research assignments. It is important to mention that the copy & paste phenomenon per se is not a fault; the problem arises when neither citations nor references are properly made. In order to enhance the learning process, we propose to use Social Network Analysis (SNA) techniques [1,9,11] for the plagiarism detection problem. SNA provides a good complementary approach for the representation, visualization, and analysis of multiple data sources, making it possible to identify patterns and interactions between social individuals [5]. We distinguish two types of copy analysis: internal and external copy. On the one hand, external copy refers to the use of a textual fragment from any document found on the Web. On the other hand, internal copy refers to the use of textual fragments from classmates’ works. Of these, the focus of this work is on internal copies. Furthermore, we defined a document similarity measure in order to visualize the obtained Inner Copy Communities (ICC) (defined in section 3). Specifically, SNA is used to generate a similarity mapping among all of a course’s documents, where each node’s role is identified and described in order to determine the ICC. By using the proposed methodology, the aim is to dissolve the set of ICCs obtained inside a given course. The paper structure is the following: first, in section 2, related work on document copy detection is reviewed. Then, the proposed methodology to determine the ICCs, the main contribution of this work, is described in section 3. In section 4, a real application and its main results are presented. Finally, the main conclusions are detailed in section 5.
¹ Fondef D08I-1015 DOcument COpy DEtection (DOCODE) project.
2 Related Work
Plagiarism detection for document sources can be classified into several categories [3]. From exact document copy to paraphrasing, different levels of plagiarism techniques can be used in several contexts [7,16]. Likewise, pairs of documents can be described in different categories as unrelated, related, partly overlapped, subset, and copied [13]. According to Schleimer et al. [12], copy prevention and detection methods are combined to reduce plagiarism. While copy detection methods can only minimize it, prevention methods can decrease it or even fully eliminate it. Notwithstanding this fact, prevention methods need the whole society to take part, thus their solution is non-trivial. Copy or plagiarism detection methods tackle different levels, from simple manual comparison to complex automatic algorithms [10]. Among these techniques, document similarity detection, writing style detection, document content similarity, content translation, multi-site source plagiarism,
and multi-lingual plagiarism detection methods have been previously proposed [2,3,7,8,10,12,15]. Recent methods subdivide a document into Document Units (DU), such as words, sentences, paragraphs or the whole document [2,3,7]. As described in PAN’09², an intrinsic and an external detection process can be applied to a given corpus [10]. Intrinsic detection refers to single-document analysis, where plagiarism must be determined by using exclusively the document text [16], while external detection compares all documents with each other or with an original reference corpus [10,16]. The effectiveness of detection methods depends on the selected DU, which determines the complexity of the algorithm and hence the execution time, and on the similarity measures used to analyze the obtained results [3,7]. Future trends in plagiarism detection have shown an evolution towards Natural Language Processing (NLP) methods, which are contributing to further plagiarism detection techniques [3,4].
² http://www.webis.de/research/corpora
3 Proposed Methodology
Every community is defined by its social nature, and therefore each one is different. Communities are commonly represented for social behaviour analysis, among other particular objectives such as enhancing interaction, diffusion and sharing of knowledge, dissolving bad influences, and several other applications [5]. As described in section 1, the main goal of this work is to represent existing copy & paste relations among students. In the following, the proposed methodology is presented and its main elements are defined.
3.1 Course Document Network (CDN)
Consider a corpus of N documents D = \{D_1, \ldots, D_N\}. Each document D_i is generated from a partition of the vocabulary set V, P(V). Then, the Copy Similarity Relation for corpus D is defined as

CSR_{ij} = \delta(D_i, D_j), \quad \forall i, j \in \{1, \ldots, N\}, \; i \neq j \qquad (1)

where \delta : P(V) \times P(V) \to \mathbb{R} is a similarity measure for two documents, and CSR_{ii} = 0, \forall i \in \{1, \ldots, N\}. A Course Document Network (CDN) is defined as a one-mode simple valued network,

CDN = (V, E) \qquad (2)

where each vertex represents a student’s document and

E \subseteq V \times V \qquad (3)
V (\text{Vertices}) := \text{set of } D_i \text{ with } i \in D \qquad (4)
E (\text{Edges}) := \text{set of } CSR_{i,j} \text{ measures} \qquad (5)
In this undirected graph, edges quantify the similarity among the students’ documents’ content. This case defines a small network, given that the average number of students per course is low. Different CDN instances are produced from the characteristics of both vertices and edges, depending on roles and interaction [5,14].
3.2 Inner Copy Community (ICC)
We refer to the Inner Copy Community (ICC) as a non-collaborative network. Its members do not want to be publicly related, and its interaction does not promote knowledge or learning (unlike, e.g., Virtual Communities of Practice), which is what most present-day communities are associated with. In practice, no interaction among the nodes of this type of network is desired: the ideal case is an unconnected network, meaning that no one is related in the Copy Community. Emerging Copy Communities must be detected to avoid the establishment of endemic structures or organizations of students which could systematically incur this bad practice. For this, an optimal network layout can be used to reorder nodes and clearly identify each component. A component is defined as a set of one or more nodes, but to speak of a community we restrict this definition to at least two nodes [5]. Based on metrics of both nodes and relations, it is possible to describe each node with a defined role inside the community, creating components that form cohesive groups. A cohesive group is defined as a component with many ties among its members. At this point, it is important who is related and who is not, and how important this relation is, in terms of centrality and betweenness [5,9]. In terms of Copy Community roles we propose three types:
– Good Student: non-connected, or isolated peripheral nodes, located far from the center of the network, with no document similarity content.
– Suspicious Student: low-connected semi-peripheral nodes located around the center of the network, with an intermediate amount of document similarity content.
– Bad Student: highly connected core nodes located near the center of the network, with a considerable amount of document similarity content.
The general hypothesis of Social Networks, according to [5], states that people who match on social characteristics will interact more often. Likewise, people who interact regularly will foster a common attitude or identity, illustrated by dense zones of people who “stick together”, or whose vertices are interconnected. It is possible to find cohesive groups generated from the interaction of the last two roles: Bad Students groups can be related with Suspicious Students or vice versa. Furthermore, the presence of each role in a given network characterises the course’s yield in a document content creativity context.
3.3 The SNA-KDD Process
Our methodology combines SNA with the Knowledge Discovery in Databases (KDD) process. In this case, the process consists of the following steps (a minimal sketch of the construction and analysis steps is given below):
– Data Selection and Preprocessing. The corpus is selected from the documents of the students of a given course. Students’ documents are transformed from their original sources into simple text documents for their manipulation in the CSR evaluation step. In contrast to common text-mining applications, stop-word removal and stemming are considered in the data selection and pre-processing step; however, these steps may not be required by the CSR function evaluation.
– Network Construction. Once documents are selected and pre-processed, the comparison method for the CSR evaluation must be defined. Here, different CSR measures could be considered for the construction of several CDNs, allowing the analyst, reviewer, or evaluator to infer the information needed for the decision step.
– Analysis, Evaluation, and Knowledge Extraction. Depending on the CDNs’ number of nodes and relations, network visualization algorithms such as Kamada-Kawai [6] can be applied, as well as clustering techniques that determine node grouping measures (like degree, core, etc.) and blockmodeling, which helps to detect cohesive groups, roles and positions.
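A minimal sketch of the Network Construction and Analysis steps, assuming a precomputed CSR matrix and using the NetworkX library (an assumption made for illustration; the experiments in section 4 use Pajek):

```python
# Sketch of building and analysing a CDN from a precomputed CSR matrix.
import networkx as nx

def build_cdn(csr):
    """Course Document Network: undirected, weighted, one node per document."""
    n = len(csr)
    g = nx.Graph()
    g.add_nodes_from(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            if csr[i][j] > 0:                      # connect only similar documents
                g.add_edge(i, j, weight=csr[i][j])
    return g

csr = [[0, 3, 0, 0],
       [3, 0, 1, 0],
       [0, 1, 0, 0],
       [0, 0, 0, 0]]                               # toy CSR matrix for 4 documents
cdn = build_cdn(csr)
print(nx.core_number(cdn))                         # core index per node (0 = isolated)
pos = nx.kamada_kawai_layout(cdn)                  # layout in the spirit of [6]
```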
4 A Real Application
Experiments were performed with Spanish documents from a graduate course at the University of Chile.
4.1 Data Selection and Preprocessing
The assignment topic used in the experiment was “how the Internet works”, where students were asked to explain how TCP/IP works when transmitting information from a sender to a receiver and vice versa. In total, there were 23 submitted documents with an average of 41 paragraphs. The assignment data consisted of documents written in Spanish and available in digital formats such as docx, doc, and pdf. The corpus was transformed into ASCII-encoded text, excluding all graphics, tables, images or any other data type different from text.
4.2 Network Construction
Once documents were pre-processed, a comparison method and a proper similarity measure were determined. For this, we chose the document paragraph as the minimal copy unit (DU). As described in section 3.1, given a corpus D = \{D_1, \ldots, D_N\}, each document D_i is represented by n paragraphs
D_i = \{p_{i,1}, \ldots, p_{i,n}\}, where p_{i,k} represents the k-th paragraph of document i. A simple document similarity measure (equation 6) is proposed for textual copy,

CP_{i,j} = \sum_{k=1}^{n} \sum_{l=1}^{m} d(p_{i,k}, p_{j,l}) \qquad (6)

where the function d : V \times V \to \{0, 1\} takes the following values:

d(p_{i,k}, p_{j,l}) = \begin{cases} 1 & \text{if } p_{i,k} = p_{j,l} \text{ or } p_{i,k} \in p_{j,l} \text{ or } p_{j,l} \in p_{i,k} \\ 0 & \text{otherwise} \end{cases} \qquad (7)

then the CSR matrix for the network’s construction is determined by the CP similarity measure,

CSR_{i,j} = CP_{i,j}, \quad \forall i \in \{1, \ldots, n\}, \; j \in \{1, \ldots, m\} \qquad (8)
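A direct, simplified implementation of Equations (6)–(8) could look as follows; this is a sketch only, assuming the documents have already been split into cleaned paragraph strings:

```python
# Sketch of the paragraph-level textual-copy measure of Eqs. (6)-(8).

def d(p: str, q: str) -> int:
    """Eq. (7): 1 if the paragraphs are equal or one is contained in the other."""
    return 1 if (p == q or p in q or q in p) else 0

def cp(doc_i: list, doc_j: list) -> int:
    """Eq. (6): number of matching paragraph pairs between two documents."""
    return sum(d(p, q) for p in doc_i for q in doc_j)

def csr_matrix(docs: list) -> list:
    """Eq. (8): CSR_ij = CP_ij for i != j, with CSR_ii = 0."""
    n = len(docs)
    return [[0 if i == j else cp(docs[i], docs[j]) for j in range(n)]
            for i in range(n)]

docs = [["tcp/ip splits data into packets", "each packet is routed independently"],
        ["each packet is routed independently", "the receiver reassembles packets"],
        ["routers forward packets hop by hop"]]
print(csr_matrix(docs))   # [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
```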
With this measure, in terms of this comparison method, we make O(n²) comparisons (where n is the number of documents). However, other comparison techniques could be used for the network construction [10]. We used the Pajek³ program for visualization and SNA metrics evaluation. In network analysis terms, the proposed CSR helps distinguish between isolated nodes (CP_{i,j} = 0) and connected nodes (CP_{i,j} > 0). In order to discover the ICC, it is required to construct the CDN. To do so, 946 paragraphs were extracted, among which we found 96 tuples of paragraphs that were exact matches. With this first step, we can visualize a good approximation of textual similarity in the CDN, which holds the ICC we need. Note that not all similar paragraphs correspond to copy; for example, some of them may share a similar web site source reference. Using these results we can find some isolated nodes that certainly hold no sign of copy, and dense zones of nodes that are undoubtedly suspicious.
³ http://vlado.fmf.uni-lj.si/pub/networks/pajek/
4.3 Analysis, Evaluation, and Knowledge Extraction
We realized that more accurate results depend on the paragraph quality. This factor could be addressed if an intelligent automatic paragraph extractor were available. As this is not yet the case, a proper clean-up must be performed, applying the paragraph criteria to clearly identify whether similar paragraphs are in fact copies. We built a PHP application that shows all similar-paragraph occurrences, allowing the evaluator to visually check and mark a paragraph as copy. As shown in figure 1, the CDN illustrates the relationships between students’ documents. There are two nodes near the center which hold a huge amount of similarity; these nodes are extremely close. From the CDN analysis, the ICC is formed by 2 nodes which hold 10 suspicious paragraphs in common. Based on this information, both nodes fit perfectly the Bad Students role (section 3.2).
Fig. 1. CDN from the student homework corpus using the proposed CSRs
All the other nodes fit the Good Students role, including cases in which similar source web sites were properly cited. No further analysis techniques among those previously described were applied, due to the network size. For evaluation purposes we applied two different perspectives. Only one case of copy was detected when we analyzed the obtained ICC. We proceeded to the visual inspection of the documents, confirming our results. Additionally, we reviewed all the other documents which, according to the experimental results, did not have any textual similarity in a copy context. Again confirming the concordance between the results and our revision, we did not find any evidence of textual copy in the rest of the documents. Therefore, our methodology was successfully evaluated based on these experimental results. While constructing the network we computed some metrics for each document in order to gain some insight about the results (these metrics are defined in table 1). As presented in table 2, seven documents present a lot of similarity: their scores exceed the average score of 9%. Most documents are evenly connected, which is why the connectivity measure is also exceeded. We need to consider all possible cases, so the metrics by themselves are not sufficient; it is necessary to combine them, involving both score and connectivity. In this way SC captures both similarity and connectivity. Only 2 documents stand out in this respect, which indicates they are both highly similar and highly connected.
4.4 Academic Actions
In the second case, when the evaluator’s results showed no sign of copy, there was only one assigned evaluator, who did not review all the documents at the same time. As a matter of fact, it could happen that the copy cases were not identified due to timing differences.
Table 1. Metrics definitions

Metric | Definition                                                              | Range
SP     | Amount of similar paragraph occurrences with the rest of the documents | ≥ 0
SUP    | Amount of paragraphs that present at least one detected similarity     | ≥ 0
P      | Number of paragraphs                                                    | ≥ 0
S      | Score measure obtained from SUP/P                                       | [0, 1]
C      | Connectivity measure that indicates the % of documents’ associations   | [0, 1]
SC     | Score-connectivity measure obtained from S × C                          | [0, 1]
Table 2. Documents information for top-ten students ordered by Score (S)

Student | SP | SUP | P  | S = SUP/P (%) | C (%) | S × C (%)
1       | 39 | 24  | 63 | 0.38          | 0.65  | 0.25
2       | 22 | 18  | 50 | 0.36          | 0.52  | 0.19
3       | 12 | 4   | 22 | 0.18          | 0.52  | 0.09
4       | 8  | 7   | 60 | 0.12          | 0.39  | 0.05
5       | 10 | 6   | 53 | 0.11          | 0.26  | 0.03
6       | 11 | 2   | 20 | 0.1           | 0.52  | 0.05
7       | 9  | 1   | 11 | 0.09          | 0.43  | 0.04
8       | 1  | 1   | 12 | 0.08          | 0.09  | 0.01
9       | 7  | 6   | 80 | 0.08          | 0.17  | 0.01
When the evaluator got the information about the copy case detected with our method, the first impression was scepticism. The situation was then confirmed from both sides, so actions had to be taken. This implied getting in contact with the students involved. After a long conversation with both, one of them confessed to being the author of the copy. As a resolution, they were both punished with bad marks. The idea was to make a statement to all students about the reprisals for copy cases. Based on this, we think people will tend to avoid sharing their own written works due to the reputation risk involved in a copy threat. Finally, all students were cautioned about the implications of future copy cases.
5 Conclusions
We have shown that educational evaluators have a difficult problem on their hands with the documents’ copy & paste situation and the way technology is being used for this purpose. Accordingly, it is far from a minor fact to have precise information about similar works in a course in order to reduce the detection problem of identifying copy cases. In this work, we have successfully applied SNA to find an interesting hidden community, which opens the possibility of changing the current students’ culture when producing written works. To the best of our knowledge, SNA had not been used before in a copy detection context. It has been used to model and analyze communities, role interactions, behaviour and evolution through time, to name a few among other social applications [5]. SNA generates many insights, allowing us to diagram the
course document network, finding communities and calculating metrics from the full network, its nodes and their relations. To date there are no copy detection methods combined with SNA approaches to find communities or anything alike. However, we also realize that SNA alone does not suffice to resolve the problem. From an evaluator’s point of view (school teachers, university professors, etc.), our methodology allows the revision to be separated from the detection: it is not necessary to read all the documents, just to check the suspicious paragraphs. It clearly reduces the time-consuming detection task, with better and more accurate results. Besides, it also adds information about the corpus in a content context: we can identify the documents that share the same site source information, the same content words, and so on. We selected a sample of real documents from an ongoing course and tested our SNA copy detection approach. We successfully identified one copy case, allowing an evaluator to take actions regarding the students involved. We conclude from this work that SNA adds value in finding ICCs in terms of students’ roles and behaviour. The SNA visualization helps to illustrate a global network representation which gives many insights for course analysis in a copy detection context. We have introduced the concept of Copy Communities in the copy detection context. As future work, this approach can be complemented with proper text mining techniques or pattern recognition algorithms in order to detect more complex plagiarism levels. Although the present work focuses on the educational field, this application can be extended to other areas. In enterprises it is possible to use this focus to identify work groups who share similar content documents and to promote interaction and align information. In terms of industrial property, it could be used in the revision process of commercial patents. In the research field, it allows information to be ordered and structures of similar content data to be formed. Even though this application is addressed to natural language, it could be adapted to programming code using a different comparison method.
Acknowledgement
The authors would like to thank Felipe Aguilera for his useful comments on Social Network Analysis. We would also like to thank the continuous support of Instituto Sistemas Complejos de Ingeniería (ICM: P-05-004-F, CONICYT: FBO16; www.sistemasdeingenieria.cl); the FONDEF project (D08I-1015) entitled “DOCODE: Document Copy Detection” (www.docode.cl); and the Web Intelligence Research Group (wi.dii.uchile.cl).
References
1. Batagelj, V., Mrvar, A.: Network analysis of texts. In: Erjavec, T., Gros, J. (eds.) Proc. of the 5th International Multi-Conference Information Society - Language Technologies, pp. 143–148 (2002)
2. Brin, S., Davis, J., García-Molina, H.: Copy detection mechanisms for digital documents. In: SIGMOD 1995: Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data, pp. 398–409. ACM, New York (1995)
3. Ceska, Z.: The future of copy detection techniques. In: Proceedings of the 1st Young Researchers Conference on Applied Sciences (YRCAS 2007), Pilsen, Czech Republic, pp. 5–107 (November 2007)
4. Ceska, Z.: Plagiarism detection based on singular value decomposition. In: Nordström, B., Ranta, A. (eds.) GoTAL 2008. LNCS (LNAI), vol. 5221, pp. 108–119. Springer, Heidelberg (2008)
5. de Nooy, W., Mrvar, A., Batagelj, V.: Exploratory Social Network Analysis with Pajek. Cambridge University Press, New York (2004)
6. Kamada, T., Kawai, S.: An algorithm for drawing general undirected graphs. Inf. Process. Lett. 31(1), 7–15 (1989)
7. Kang, N., Gelbukh, A.F., Han, S.-Y.: Ppchecker: Plagiarism pattern checker in document copy detection. In: Sojka, P., Kopeček, I., Pala, K. (eds.) TSD 2006. LNCS (LNAI), vol. 4188, pp. 661–667. Springer, Heidelberg (2006)
8. Maurer, H., Kulathuramaiyer, N.: Coping with the copy-paste-syndrome. In: Bastiaens, T., Carliner, S. (eds.) Proceedings of World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education 2007, Quebec City, Canada, pp. 1071–1079. AACE (October 2007)
9. Musial, K., Kazienko, P., Bródka, P.: User position measures in social networks. In: SNA-KDD 2009: Proceedings of the 3rd Workshop on Social Network Mining and Analysis, pp. 1–9. ACM, New York (2009)
10. Potthast, M., Stein, B., Eiselt, A., Barrón-Cedeño, A., Rosso, P.: Overview of the 1st International Competition on Plagiarism Detection. In: Stein, B., Rosso, P., Stamatatos, E., Koppel, M., Agirre, E. (eds.) SEPLN 2009 Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN 2009), pp. 1–9 (September 2009), CEUR-WS.org
11. Ríos, S.A., Aguilera, F., Guerrero, L.A.: Virtual communities of practice’s purpose evolution analysis using a concept-based mining approach. In: Velásquez, J.D., Ríos, S.A., Howlett, R.J., Jain, L.C. (eds.) Knowledge-Based and Intelligent Information and Engineering Systems. LNCS, vol. 5712, pp. 480–489. Springer, Heidelberg (2009)
12. Schleimer, S., Wilkerson, D.S., Aiken, A.: Winnowing: local algorithms for document fingerprinting. In: Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, pp. 76–85. ACM, New York (2003)
13. Seo, J., Croft, W.B.: Local text reuse detection. In: SIGIR 2008: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 571–578. ACM, New York (2008)
14. Thimbleby, H., Oladimeji, P.: Social network analysis and interactive device design analysis. In: EICS 2009: Proceedings of the 1st ACM SIGCHI Symposium on Engineering Interactive Computing Systems, pp. 91–100. ACM, New York (2009)
15. Vandehey, M.A., Diekhoff, G.M., LaBeff, E.E.: College cheating: A twenty-year follow-up and the addition of an honor code. Journal of College Student Development, 468–480 (2007)
16. Eissen, S.M.z., Stein, B.: Intrinsic plagiarism detection. In: Lalmas, M., MacFarlane, A., Rüger, S.M., Tombros, A., Tsikrika, T., Yavlinsky, A. (eds.) ECIR 2006. LNCS, vol. 3936, pp. 565–569. Springer, Heidelberg (2006)
Enhancing Social Network Analysis with a Concept-Based Text Mining Approach to Discover Key Members on a Virtual Community of Practice
Héctor Alvarez¹, Sebastián A. Ríos¹, Felipe Aguilera², Eduardo Merlo¹, and Luis Guerrero²
¹ Department of Industrial Engineering, University of Chile
{halvarez,emerlo}@ing.uchile.cl, [email protected]
² Department of Computer Science, University of Chile
{faguiler,luguerre}@dcc.uchile.cl
Abstract. In order to have a successful VCoP, two important tasks must be performed: on the one hand, the community must provide useful information to every member through a good organization of contents and topics; on the other hand, the behavior of the members must be understood (i.e. which are the key members or experts, which sub-communities exist, etc.). Social Network Analysis (SNA) is a powerful tool for understanding a community’s members; however, our thesis is that the state of the art in SNA is not sufficient to obtain useful knowledge from a VCoP. Moreover, we think that traditional SNA may lead to wrong results. We propose to combine traditional SNA with data mining techniques in order to produce results closer to reality and gather useful knowledge for the enhancement of VCoPs. In this work, we focus on discovering key members of a VCoP by combining SNA with concept-based text mining. We successfully tested our approach on a real VCoP with more than 2500 members, and we validated our results by asking the community administrators.
1 Introduction
Virtual Communities of Practice (VCoP) have experienced explosive growth on the Internet in recent years. A VCoP’s value lies in connecting people who want to share or learn about a specific topic by interacting on an ongoing basis [9]. The Web facilitates interaction between community members without the need for face-to-face contact, providing a ubiquitous environment and an atemporal virtual communication space. To support this virtual interaction, forums, wikis, and other similar tools are commonly used. For VCoPs it is very important to generate, store and keep the knowledge resulting from members’ interaction. The success of a VCoP depends on a governance mechanism [7] and on the participation of key members (so-called leaders [2] or core
members [7]). Likewise, every VCoP member’s goal is to learn specific knowledge from the community; therefore, the contents of posts, which specific topics interest a member, etc., must be considered. There are many definitions of what a key member is: the most participative member [7], the member who answers the other members’ questions [5], or the member who encourages other members to participate [2]. However, none of these definitions takes into consideration the content or the meaning of the interactions (posts, replies, etc.). Interaction is usually measured by user A replying to a post of user B; whether or not A successfully answers the question is not taken into account. Those approaches just consider sequences of posts, even if the posts talk about a topic totally different from the thread which contains them. Therefore, we hypothesize that key members defined this way lead to wrong results and invalidate the results obtained from the application of SNA techniques (core discovery techniques, expert discovery techniques, discovery of sub-communities, etc.). In our approach, a key member is defined as a member who participates (asking or answering) according to a specific purpose of the VCoP. The more aligned to the VCoP’s purposes, the greater the degree of importance a member has. This way, a key member must be obtained by combining text mining techniques with SNA techniques in a single process. To measure members’ participation it is very common to apply SNA. The result of this kind of process is a graphical representation which helps to find the community core, sub-communities, network clusters, peripheral members, etc. Key members belong to the community core; therefore, we should apply core-discovery algorithms such as HITS to find them. For the above reason we use the Concept-Based Mining approach developed by Ríos et al. [8], which measures the purpose accomplishment of the entire community. We will focus on the discovery of key members using the HITS algorithm. To do so, we modeled the community using two representations: reply to the owner of a certain post, and reply to a specific post. Then, we build the graphs in the classic manner and show how both topologies change when using our proposal. Finally, we extract the top 10 key members from the administrators’ surveys and compare them with HITS applied to those topologies. We show the benefits of using concept-based mining for SNA representation in a real VCoP with more than 2500 members and 8 years of data, and we discovered interesting and promising results.
2 Related Work
There are different kinds of communities. Kim et al. [4] organize Social Web Communities by describing the kind of users, the uses and the features needed for every kind of community. Important missions for a Community of Knowledge are sharing user-created content and retaining users, especially key members. In this context, a VCoP is also a Community of Knowledge; therefore, it should accomplish such missions, but it must also ensure that community members fulfil their goals (or purpose) when using the VCoP.
In order to enhance a community, there are many approaches to the analysis and understanding of a community; these will be explained in the following sections.
2.1 Social Network Analysis
When an administrator wants to make changes to the community structure or contents, it is necessary to know how the community works. However, this implies understanding human relationships: how members relate to each other. SNA helps to understand such relations through a graph representation based on users (nodes) and relations (arcs). There are then several algorithms to extract experts (or key members), classify users according to their relevance within the community, and discover and describe the resulting sub-communities, among others. Besides, depending on the kind of community, the meaning of the analysis may vary. Yulupula et al. [10] built a social network based on e-mails sent between members of an enterprise in order to predict the organizational structure using SNA. They had mismatches produced by data quality, and possibly because they built the network only with a “sent-mail” approach, losing useful information such as the mails’ body. In this case, they only wanted to know how the community interacted. In another case, Pfeil et al. [6] analyse a community of older people. The main idea is to study how the emotional communication content exchanged alters the community structure. They found different network configurations for every support-level category, verifying their hypothesis. The study was made in a community of 47 users with 1.5 years of data (400 messages). That amount of data would not be enough to make other kinds of analyses, such as studying the relationship between the content and the network structure, finding key members of the community, or studying the evolution of the community itself through time.
2.2 Classification Methods
Another way to answer the question “what do you want to enhance?” is to classify some community features. This makes the community improvement decision easier: even if you do not know what to do, the classification helps to find the improvements that the community needs. Sometimes, classification is used as an exploratory data tool to understand the community; however, the information obtained by these tools provides greater benefits to the community. Hong et al. [3] use Q-A discussion boards to classify posts into question-posts or reply-posts and then find the proper reply to a question. This classification would improve the community in three ways: enhancing search quality, bringing suggestions when users ask a similar question, and finding experts according to the replies provided. Recognising experts would help to solve questions faster and better in the community. Amatriain et al. [1] improve a recommendation system using the opinion of the community’s experts. They compare user and expert preferences to know which experts are more similar to the user, and then prepare a recommendation list based on
the classified experts’ opinions. The advantage of this work is that experts are identified, making it a good example of expert-oriented enhancements. Unfortunately, this approach assumes that we already know who the experts are. In both cases, the concern is who the experts are and how they are obtained. For that reason, this work focuses on finding a new approach to discover key members (experts or core members). The better they are identified, the more relevant the results obtained by other techniques will be. Thus, it will be possible to obtain better enhancements for the whole community.
2.3 Concept-Based Text Mining to Enhance VCoP
A VCoP talks about specific main topics, and one of its worries is that the community does not deviate from them. Also, VCoPs have purposes (or goals) that administrators want to accomplish. Therefore, it is important for a VCoP to supervise the goals through time and analyse their evolution. Then, enhancements will surface based on this temporal purpose evolution analysis [8]. Ríos et al. [8] define the VCoP goals of a website in collaboration with community members. When goals are well established and defined, it is possible to apply concept-based text mining to evaluate the accomplishment of the community’s goals. The concept-based text mining uses fuzzy logic theory to assign a “goal score” to every forum in the community. This “goal score” shows how aligned with the specific goal the text inside a community forum is. Having these scores, VCoP administrators evaluate the goals’ accomplishment, allowing them to make the proper enhancements. For example, if two forums have similar scores for a specific goal, an enhancement could be to merge both. On the other hand, if a category has very high scores in two specific goals, it is possible to split the forum into two independent forums, each closer to one goal. In that work, the objective was to evaluate the goal accomplishment of the VCoP’s forums, but this does not help to evaluate how users contribute to the purpose accomplishment, which makes finding the VCoP’s key members difficult. However, the use of concept-based mining improves the search accuracy.
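The goal-scoring step can be pictured with a toy fragment like the one below; the keyword-coverage membership used here is a simplification introduced for illustration, not the actual fuzzy evaluation of [8].

```python
# Toy sketch: assigning goal-accomplishment scores to a post.
# Each goal is a keyword set; the score is the fraction of its keywords in the post.

GOALS = {
    "build_effects": {"pedal", "distortion", "circuit", "schematic"},
    "share_knowledge": {"tutorial", "howto", "guide", "explain"},
}

def goal_scores(post_text: str) -> dict:
    words = set(post_text.lower().split())
    return {goal: len(words & kws) / len(kws) for goal, kws in GOALS.items()}

post = "Here is a schematic and a short tutorial for a distortion pedal circuit"
print(goal_scores(post))
# {'build_effects': 1.0, 'share_knowledge': 0.25}
```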
3 Enhancing SNA with Concept-Based Text Mining
The main question of the present work is how to enhance the discovery of key members. This question has no simple answer. The first step is to obtain a graphic representation of the inner social community. The second step is to apply a core-members algorithm (like HITS) to this representation. As a result of the algorithm’s application, we obtain a ranking of all community members, leaving at the top of the ranking the expert (core or key) members. We distinguish key members from expert members, since a key member has several characteristics that define him/her. Firstly, a key member may or may not be an expert in a field. Secondly, he/she may increase the interaction in the community by asking interesting questions, which produce answers from the experts in the field. This means that the questions are very specific to a field; therefore,
only experts are able to answer them. In other words, a key member is a person totally aligned with the VCoP’s goals and topics, thus producing content which is very relevant to satisfying the other members’ interests. The only way to measure a key member as we define him/her is to use a hybrid approach of SNA combined with semantic-based text mining. Likewise, the mining process must always include the different purposes of the community. This is why we chose the concept-based text mining approach [8].
3.1 Network Configuration
As mentioned before, to build the social network we take into account members’ interaction. In general, members’ activity is followed according to members’ participation. Participation appears when a member posts in the community, which is also the case when modeling a VCoP. Because the activity of a VCoP is described according to members’ participation, the network will be configured this way: the nodes will be the VCoP members, and the arcs will represent interaction between members. How to link the members and how to measure their interactions to complete the network is our main concern. There are forums where you know who is replying to whom, but in other forums this is not that clear; we work with the latter. To face this problem, we describe two VCoP network representations according to which member is being replied to (a sketch of both constructions is given at the end of this subsection):
1. Creator-oriented Network: when a member creates a topic, every reply will be related to him/her.
2. Last Reply-oriented Network: every reply in a topic will be a response to the last post.
We can see a typical forum structure in Fig. (1); then, in Fig. (2), we can observe how the forum is converted into a graph. In Fig. (2), arcs represent members’ replies and nodes represent the users who made the posts. In our first approach, the weight of an arc will be a counter of how many times a member replies to another. The problem is that we are not considering whether the reply is according to the community purpose (for any of these configurations). We have to filter noisy posts. This will be done using concept-based text mining applied to the posts’ texts. Of course, the networks will be different from those obtained before, because, in order to draw an arc, we now compute the similarity between the concepts of a post and its reply. If a post and a certain reply are sufficiently close, then we say they are similar and, therefore, that post and reply are relevant to a specific topic (concept).
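The two configurations can be sketched as follows; the thread structure used here is a simplified stand-in for the real forum data, and the weight handling is only one possible choice:

```python
# Sketch of the two network configurations. A thread is an ordered list of
# (author, post_text) tuples, the first entry being the topic creator.
import networkx as nx

def creator_oriented(threads):
    g = nx.DiGraph()
    for thread in threads:
        creator = thread[0][0]
        for author, _ in thread[1:]:
            w = g.get_edge_data(author, creator, {"weight": 0})["weight"]
            g.add_edge(author, creator, weight=w + 1)      # reply points to the creator
    return g

def last_reply_oriented(threads):
    g = nx.DiGraph()
    for thread in threads:
        for (prev_author, _), (author, _) in zip(thread, thread[1:]):
            w = g.get_edge_data(author, prev_author, {"weight": 0})["weight"]
            g.add_edge(author, prev_author, weight=w + 1)  # reply points to previous post
    return g

threads = [[("U1", "post 1"), ("U2", "post 2"), ("U3", "post 3"), ("U2", "post 4")]]
print(creator_oriented(threads).edges(data=True))    # U2->U1 (weight 2), U3->U1 (weight 1)
print(last_reply_oriented(threads).edges(data=True)) # U2->U1, U3->U2, U2->U3 (weight 1 each)
```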
3.2 Concept-Based Text Mining for Network Filtering
Previous work [8] provides a method to evaluate the accomplishment of community goals; here we use this approach to classify the members' posts according to the VCoP's goals. These goals are defined as a set of terms, which are composed of keywords or statements in natural language. To obtain a goal accomplishment
Fig. 1. A typical forum structure; the circles indicate the users who posted
Fig. 2. Two different network models (Creator-oriented Network and Last Reply-oriented Network) representing the forum from Fig. 1
score, we use fuzzy logic to evaluate how much of a goal is contained in a single post. We then have a post vector whose components are the goal accomplishment scores of the post. The idea is to compare the goal-score vectors of two members' posts and, if their similarity is above a certain threshold, to consider that there is an interaction between them. We argue that this helps us to avoid, filter or erase irrelevant interactions. For example, in a VCoP with k goals, let p_j^{u'} be post j of user u' that is a reply to post i of user u (p_i^u). The distance between them is calculated with Eq. (1):

d(p_i^u, p_j^{u'}) = \frac{\sum_k g_{ik}\, g_{jk}}{\sqrt{\sum_k g_{ik}^2}\; \sqrt{\sum_k g_{jk}^2}}    (1)
where g_{ik} is the score of goal k in post i. Clearly, the distance is defined only if p_j^{u'} is a reply to p_i^u. After that, we calculate the weight of the arc uu' (w_{uu'}) with Eq. (2):

w_{uu'} = \sum_{i,j:\; d(p_i^u, p_j^{u'}) \geq \theta} d(p_i^u, p_j^{u'})    (2)
We used this weight in both configurations previously described (Creator-oriented and Last Reply-oriented). Afterwards, we applied HITS to find the key members in the different network configurations. These key members were then validated with the VCoP's administrators.
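A minimal sketch of this filtering-and-ranking step is given below. It assumes that a goal-score vector is already available for each post and uses the networkx library for HITS, neither of which is specified by the paper.

```python
import numpy as np
import networkx as nx

def similarity(g_i, g_j):
    """Normalized dot product between two goal-score vectors, as in Eq. (1)."""
    g_i, g_j = np.asarray(g_i, float), np.asarray(g_j, float)
    denom = np.linalg.norm(g_i) * np.linalg.norm(g_j)
    return float(g_i @ g_j / denom) if denom else 0.0

def concept_based_ranking(replies, theta=0.5):
    """replies: iterable of (replier, replied_to, g_reply, g_original) tuples.
    Arcs whose similarity is below theta are filtered out and the remaining
    similarities are accumulated as arc weights (Eq. 2); HITS then ranks members."""
    graph = nx.DiGraph()
    for u, v, g_u, g_v in replies:
        d = similarity(g_u, g_v)
        if d >= theta:                                   # keep on-topic replies only
            weight = graph[u][v]["weight"] + d if graph.has_edge(u, v) else d
            graph.add_edge(u, v, weight=weight)
    hubs, authorities = nx.hits(graph, max_iter=1000)
    return sorted(authorities, key=authorities.get, reverse=True)
```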
4 Experiment in a Real VCoP
Our experimental VCoP was the plexilandia.cl virtual community. Plexilandia¹ is a VCoP formed by a group of people who have come together around building music effects, amplifiers and audio equipment in a "do it yourself" style. Although they have a web page with basic information about the community, most of their members' interactions take place in the discussion forum. Today, Plexilandia counts more than 2500 members over more than 8 years of existence, of which about 2100 are active members. All these years they have been sharing and discussing their knowledge about building their own "plexies" and effects; besides, there are other related topics such as lutherie, professional audio, and buying/selling parts. In the beginning the administration task was performed by only one member; today it is performed by several administrators (by 2010 there were 3). In fact, the amount of information generated weekly makes it impossible to leave the administration task to just one admin. We processed 59,279 posts created by 2107 active members from September 2002 to April 2009. We set the threshold for the normalized dot product of Eq. (1) to 0.5; the whole process took about 1 hour. We first used the creator-oriented and last reply-oriented networks in the classic manner. Afterwards, we modeled them using the concept-based approach, with the same concepts as in previous work [8] and the 0.5 threshold, in order to build the VCoP's network with concept-based text mining. To begin the experiments, we asked the administrator who, in his view, the key members of Plexilandia are. Although we are able to perform monthly or weekly analyses, we performed one analysis per year, since it is easier to ask the administrator who the key members were this year, last year, and so on, than last week or in week 35 of last year. The resulting survey is shown in Table 1. Afterwards, we built Plexilandia's networks with the two topologies mentioned before. Then we built the same topologies using the concept-based
¹ "Plexi" is the nickname given to the Marshall amp heads, model 1959, that have a clear perspex (a.k.a. plexiglass) fascia on the control panel with a gold backing sheet showing through, as opposed to the metal plates of the later models.
Table 1. Key members based on the administrators' survey

User      Note
user2     Administrator
user1254  –
user37    Administrator
user808   he is not participating lately
user4     –
user1825  –
user999   he is not participating lately
user210   –
user240   –
user874   –
user321   participation occasionally
user234   –
user33    –
Table 2. Top 13 key members found with HITS in every approach

rank  hits-creator  hits-reply   hits-cb-creator  hits-cb-reply
1     user2         user2        user2            user2
2     user8         user8        user8            user8
3     user31        user31       user226          user210
4     user20        user210      user210          user240
5     user210       user226      user23           user380 (-)
6     user12        user162      user33           user4
7     user380 (-)   user20       user29           user12
8     user162       user29       user37           user33
9     user226       user12       user4            user226
10    user23        user380 (-)  user20           user128 (-)
11    user4         user33       user240          user31
12    user33        user240      user12           user37
13    user211 (-)   user4        user31           user29
approach. We used Pajek² to draw the networks. Afterwards, HITS was applied to every network topology; these results are shown in Table 2. Finally, we summarized all results in Table 3. This table shows the intersection between the key members extracted by each algorithm and the key members from the survey in Table 1; users appearing in both lists are marked with an X. We can observe that the concept-based approach discovered one additional key member in every topology used. Concept-based text mining is meant to discover new knowledge, so we wondered what kind of users are those not marked with an X in the table. To find out, we once more asked the community expert, this time showing him the lists of key members gathered by every algorithm (Table 2). Surprisingly,
² http://vlado.fmf.uni-lj.si/pub/networks/pajek/
Table 3. Key Members' Summary

user      hits-creator  hits-cb-creator  hits-reply  hits-cb-reply
user2     X             X                X           X
user1254
user37                  X
user808
user4     X             X                X           X
user1815
user999
user210   X             X                X           X
user240                 X                X           X
user874
user321
user234
user33    X             X                X           X
he recognized that most of the users on the lists were in fact key members. He had forgotten many of them, since we used data going back to 2002; however, when he saw them on the list he remembered them, validating almost every user as a key member. In Table 2 we marked with a (−) sign those members who are not key members. We can observe that every member discovered by hits-cb-creator is a key member. Unfortunately, hits-cb-reply was worse than hits-reply at detecting key members.
5 Conclusion
We propose to combine traditional SNA with data mining techniques in order to produce results closer to reality and to gather useful knowledge for VCoP enhancement. We applied two network topologies to represent the VCoP: creator-oriented and last reply-oriented networks. We used Plexilandia.cl, a VCoP with more than 2100 active members out of a base of 2500 members. We showed that SNA combined with the concept-based text mining approach outperforms SNA alone at discovering a VCoP's key members in the case of the creator-oriented network topology. However, in the case of the last reply-oriented network, plain SNA outperformed SNA plus the concept-based approach. We consider the results promising, since we used the whole history to perform the analysis. Moreover, we did not take ranking position into consideration, which seems to be much closer to reality in concept-based SNA approaches. Thus, more experimentation is needed to show the real impact of our proposal.
Acknowledgments. The authors would like to thank the continuous support of the Instituto Sistemas Complejos de Ingeniería (ICM: P-05-004-F, CONICYT: FBO16); the Initiation into
Research Funding, project code 11090188, entitled "Semantic Web Mining Techniques to Study Enhancements of Virtual Communities"; and the Web Intelligence Research Group (wi.dii.uchile.cl).
References
1. Amatriain, X., Lathia, N., Pujol, J.M., Kwak, H., Oliver, N.: The wisdom of the few: a collaborative filtering approach based on expert opinions from the web. In: Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Boston, MA, USA, pp. 532–539. ACM, New York (2009)
2. Bourhis, A., Dubé, L., Jacob, R., et al.: The success of virtual communities of practice: The leadership factor. The Electronic Journal of Knowledge Management 3(1), 23–34 (2005)
3. Hong, L., Davison, B.D.: A classification-based approach to question answering in discussion boards. In: Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Boston, MA, USA, pp. 171–178. ACM, New York (2009)
4. Kim, W., Jeong, O., Lee, S.: On social web sites. Information Systems 35(2), 215–236 (2010)
5. Liu, X., Croft, W.B., Koll, M.: Finding experts in community-based question-answering services. In: Proceedings of the 14th ACM International Conference on Information and Knowledge Management, Bremen, Germany, pp. 315–316. ACM, New York (2005)
6. Pfeil, U., Zaphiris, P.: Investigating social network patterns within an empathic online community for older people. Computers in Human Behavior 25(5), 1139–1155 (2009)
7. Probst, G., Borzillo, S.: Why communities of practice succeed and why they fail. European Management Journal 26(5), 335–347 (2008)
8. Ríos, S., Aguilera, F., Guerrero, L.: Virtual communities of practices purpose evolution analysis using a Concept-Based mining approach. In: Knowledge-Based and Intelligent Information and Engineering Systems, pp. 480–489 (2009)
9. Wenger, E., McDermott, R.A., Snyder, W.: Cultivating communities of practice. Harvard Business Press, Boston (2002)
10. Yelupula, K., Ramaswamy, S.: Social network analysis for email classification. In: Proceedings of the 46th Annual Southeast Regional Conference on XX, Auburn, Alabama, pp. 469–474. ACM, New York (2008)
Intelligence Infrastructure: Architecture Discussion: Performance, Availability and Management Giovanni Gómez Zuluaga1, Cesar Sanín2, and Edward Szczerbicki2 1
Telecommunications Engagement Architect, Sun Microsystems, Inc. Bogota, Colombia [email protected] 2 School of Engineering, Faculty of Engineering and Built Environment The University of Newcastle, Australia University Drive, Callaghan, 2308, NSW {Cesar.Sanin,Edward.Szczerbicki}@newcastle.edu.au
Abstract. Organizations develop a growth strategy for their informational platform based on certain key drivers. This determines business plans and their future informational and analytical capacity. However, each solution has a different deployment depending on the selected architecture and management model. In the specific case of Business Intelligence (BI) infrastructures, the deployment should be decided according to the speed of the decision-making processes, which usually execute in real time; it therefore determines the rate of flexibility at which the business can grow. Businesses grow, but the key drivers can remain the same. This paper analyzes the elements required for an optimal deployment of a business architecture.
Keywords: Business Intelligence, Architecture Management, Architecture Performance, Service Availability, Sizing Architecture.
1 Introduction
Nowadays, on-demand applications are deployed on real-time platforms. Such applications are constrained in terms of response time and therefore require a protocol that allows errors to be detected and corrected during execution while keeping response times under certain restrictions. If an application does not meet its specified response times, it can be said to have failed. In order to guarantee correct behaviour within a correct response time, the application must be predictable, and the required infrastructure for such applications must be scalable. This is why "real time" is said to be one of the most costly variables in terms of infrastructure: it defines the Service Level Agreement (SLA) that the user is willing to pay for. Applications are designed taking non-functional business components into consideration; however, in any case, three basic variables must be determined when designing a Business Intelligence (BI) platform: Performance, Availability and Management. In the following, these three aspects are analyzed in terms of BI architectures.
2 Business Intelligence Performance
2.1 Deployment Time
Performance is determined by the way in which deployed components are executed in the least amount of time. Experience and the literature offer reference benchmarks that allow the times used in BI architectures to be estimated. However, the best way to determine execution times is to understand which resource generates the major conflict on every tier and to confront the problem by neutralizing it and understanding the dynamic workload aggregation model (resources on demand). There are two important points in the deployment of BI architectures: tiers and resource consumption. Table 1 illustrates each tier and its resource consumption ("+": the resource is used; "++": the resource is highly used; "-": the resource has no implications for the tier).
Table 1. Tier resource consumption requirements.
Each application behaves differently; however, particularly in BI, resources are consumed according to what is occurring. Services are presented and, according to the usage schema, they must be consumed; this, however, is determined by the kind of application. What about the resources? It is necessary to observe the way in which resources are consumed in the deployment model. In our analysis, granularity is defined by tiers.
2.1.1 Multi-Tier Architecture for BI Applications
Tiers allow resource consumption to be understood. In multi-tier programming, access to resources is assigned dynamically. In order to understand a BI architecture by tiers, the resources must be modeled using a multi-tier architecture. The multi-tier architecture includes three tiers: the Presentation (or User) Tier, the Business and Logic Tier, and the Data Tier [2]. All these tiers should reside on different servers; thus, services are segmented for BI. Even though it is usual for the Presentation Tier to reside on several servers (server architecture), the Business and Logic Tier, as well as the Data Tier, can reside on the same computer. Nevertheless, if the size or complexity of the database increases, they can be separated onto two or more computers, which would receive the requests from the server that contains the Business and Logic Tier. If, instead, complexity forces the Business and Logic Tier to be segmented, this tier could reside on a series of servers that would send requests to a single database. Additionally, in very complex systems, it is possible to have a series of
servers that execute the Business Tier and a different chain of servers to operate the database.
2.2 Resource Consumption: CPU Design
From Table 1 it is possible to infer that memory is a resource that acts as a "commodity" with regard to urgency; that is, if there is a problem in any application, it can be neutralized by the use of memory. For instance, if the problem is CPU contention, it is possible to generate dynamic access to virtual memory as a way to avoid processing bottlenecks and improve performance. If the issue is with I/O or networking, the solution can be the same. Memory is the most important hardware resource, and characteristics such as error correction and problem detection at run time make memory a crucial component for the execution of applications. Following the above analysis, memory is a resource that should be taken into account in this paper's discussion. It is clearly a critical resource; however, it is equally critical for all tiers, and therefore we decided to leave it out of our analysis. The Presentation Tier represents the applications that execute in the users' view. Such applications must abide by general security, precisely because of their exposed condition. Some time ago, this tier used to be called the DMZ (demilitarized zone) [3]. Nevertheless, nowadays it is not just about the services provided, but about the load each application creates. Examples of applications within the Presentation Tier are: net services, authentication services, web services, portal tier-1 services, and authorization services. In that sense, it is possible to refer to BI as a service as well. BI models determine the existence of net access, authorization and authentication services; therefore, there is a web service that contains documents offered for users' consultation. BI platforms already developed are conceived as multi-platform architectures, that is, eBusiness platforms that use .NET and JAVA as deployment methods. However, the pseudo-languages or scripting languages they offer, such as PHP, Perl and Python, are not gaining market share due to lack of support and scalability. Applications in the Presentation Tier, according to Table 1, are network consuming, implying a requirement for good network design and scalability in the TCP/IP stack. Variables such as the operating system (OS) or communication control are fundamental in terms of scalability and performance. Even though the CPU is not a fundamental resource in this tier, it is very important to have multithreaded CPUs in order to apply dynamic instancing of the exposed applications. Processors such as Power from IBM [4] or SPARC from ORACLE/Sun [5] allow multiple threads to be opened in real time; thus, such servers can respond to the scalability needs established for BI. Clock speed becomes a variable of much less importance than the means used to access the resources from the CPU. Therefore, manufacturers aim to improve performance by optimizing the way in which instructions and buses complement each other. Having a "high clock" indicates that applications should execute faster, that is, at a high clock frequency; however, they still must wait for the time they require. This is one of the reasons why applications designed for a single thread did not succeed: the way in which the CPUs are accessed generates latency.
Along these lines, the aim in designing any processor is to allow processing resources to be used 100% of the time as much as possible (i.e., always). This should not worry platform
managers who prefer to have idle resources as a way to develop decision-making strategies when a problem arises. If the platform is using 100% of its resources, it means the platform is making optimal use of them. What everyone would prefer, then, is to minimize idle times and have an infrastructure that is used as much as possible. Such an idea is only possible by taking state-of-the-art technologies into account, and all threading models aim to achieve it. In general, the microprocessor industry is nowadays turning to multiple-thread models, and improvements in clock speed no longer deliver what they used to in terms of performance, hence the concept of extended efficiency. Looking at market tendencies, it is possible to identify that infrastructure design is heading precisely toward what the market requires: (i) network computing is thread rich, (ii) Moore's law provides an ever-increasing transistor budget, (iii) memory latency is worsening, and (iv) processor design is growing in complexity. Four trends, two positive and two negative, are all converging; the big bang is happening now and we are all influenced by it. We need to look into these issues very closely as we design our next-generation microprocessors. BI applications must support a multithreading model [6] in order to improve deployment performance. This is implemented when the source code of the solution is created, that is, several independent execution paths are coded and can be executed in parallel. The importance of the solution management is that it encompasses primitives for parallel programming, multithreaded libraries and parallelizing compilers that can be used to generate multithreaded applications. As a result, a program is executed containing several programmable instances called lightweight processes, lightweight threads, or just threads. At the same time, software applications must support multithreaded deployment and, of course, the hardware must support the execution of multiple threads; in that case, if the hardware deployment is multithreaded, there is no programming control. SMP (Symmetric Multi-Processor) systems [7] were the first step towards multithreading on a processor-by-processor basis. In SMP, threads work in parallel, but there is no additional utilization of the idle hardware (compared to hardware multithreading). Cellular Multithreading (CMT), Simultaneous Multithreading (SMT) and Virtual Terminal (VT) try to solve SMP limitations from different perspectives. Summarizing, in terms of processing there are two approaches that seek to safeguard the deployment of new-generation applications while offering a better informational business deployment. Firstly, ILP (Instruction Level Parallelism) [8] establishes better performance according to the instruction types and how they can reach the processor through the same bus. Secondly, TLP (Thread Level Parallelism) [9] proposes similar parallelism at the processor level, that is, more instructions can be executed in the same clock cycle.
2.3 Disk IO Performance
2.3.1 Disk Architectures
Three possible disk architectures can be taken into account; their characteristics, however, must be consulted in other publications [10]. We will refer to them as DAS (Direct Attached Storage), NAS (Network Attached Storage) and SAN (Storage Area Network).
2.3.2 BI Shared Disk Features
The disk architecture for BI must allow a tiered deployment in order to improve data access performance. It is therefore not mandatory to use a SAN architecture; it is possible to combine several disk architectures depending on the homogenization required and the costs generated. It is then necessary to consider some BI data access characteristics, namely: latency, connectivity, distance, availability and security. From the beginning, safety in shared disks has been a key factor due to the possibility of a system being able to access a device that it should not, which interferes with the flow of information; for this reason, zoning technology [11] has been proposed and implemented. In other words, groups of elements are isolated from the rest to avoid the above problems. Zoning can be implemented in hardware, software or both; for instance, it is possible to group elements by port or WWN (World Wide Name), and an additional technique implemented at the storage device is to make a LUN (Logical Unit Number) [12] accessible only to a predefined list of servers or nodes (identified by their WWN).
2.4 Networking: Shared Resource Access in BI
The concept of networking is very important when designing BI applications. Taking into account that network resources are in the access and business tiers, they should be more restrained due to their exposed condition in a tiered architecture. Usually, networking is focused on data transfer between tiers; however, network access is represented by the transactions per second generated by applications. A good network design has positive implications for the architecture deployment; thus, it has to take into consideration service segmentation, grade separation and security. Properly sizing a BI service network to meet the users' needs must consider the following features, which describe networking as a resource in tiers: speed, safety, availability, scalability, and reliability.
3 BI Availability: How to Reach Six 9's
High availability cannot be achieved by merely installing failover software and walking away. Computer systems are very complex, and the solutions that make them work best are complex as well. They require planning, work, testing, and a general ongoing effort by knowledgeable people at every implementation level. Applications must be failure tolerant and, even more importantly, hardware must be able to recover automatically from failures. Therefore, if you want this functionality to always be present, you must regularly test your recovery procedures; it is important to continually check their operation and measure response times. If they do not work as expected, the time to find out is not when something has gone wrong and the pressure is on. Consequently, the goal is to help design better BI systems with maximum availability and predictable downtime and recovery times. For this aim, a deeper end-to-end look at these architectures is required, from the network, services, and application perspectives. In a real system, if one component fails, it is repaired or replaced by a new component. If this new component fails, it is replaced by another, and so on. The fixed
component is considered at every stage as a new component. Thus, during its lifetime, a component can be seen in either of the following states: running or repairing. The running state indicates that the component is operational, and repairing means it has failed and has not yet been replaced by a new component. Consequently, the first problem is to measure the availability of such a component in a BI solution, understanding that 100% availability does not exist. Rather, the ideal is to approach 'infinite' 9's, i.e., 99.99999...%. In other words, as a probabilistic measure, close-to-100% availability should be conceived from the design stage in all architecture tiers and applications. The maximum availability of a BI solution is bounded by the minimum availability of any component deployed in it. This is where managing availability becomes so complex, due to the existence of non-functional components that make everything harder to measure; for instance, electrical components are not necessarily part of a straightforward BI design and implementation. Therefore, we must conceive everything as a probabilistic measure in which any component should be examined in detail. The example below illustrates a user who tries to access a system that must perform some transactions on the Application Server and on the File Server. If the network interface of his/her PC does not work then, by extension, the BI solution will not work. Although the network interface is not formally part of the BI solution (the BI design has disjunctive components), the user's perception is obviously that the 'system is unavailable'.
3.1 Levels of Availability by Architecture
3.1.1 Tier 1 – Fail Over / Fail Back
This is the first phase of an availability deployment. Nowadays, no system is considered to have a preference regarding the failover concept, since all market components such as disks, CPUs, memory and networks allow failover to be applied. If a hardware failure occurs in any machine, the hardware itself is able to continue servicing automatically because it has a replicated component (failover component). Once the failed machine recovers, services are migrated back to the original machine (failback). Such recovery ability guarantees high availability of the services offered by the servers through the hardware, minimizing the users' perception of failures.
3.1.2 Tier 2 – Load Balancing
With multiple nodes, components can be balanced, allowing nodes to act as failure-recovery elements for one another. This is used for network services provided that not much data must be synchronized. Services that allow load balancing can improve access to applications while avoiding total system downtime. In this way there is always a node responding, and it is possible to implement automatic reconfiguration of the services depending on who is delivering the information. Load Balancers can operate at the application level by using algorithms like Round Robin or Workload Aggregation [13]. Most business applications support load-balancing architectures, allowing greater component availability. Nevertheless, Load Balancers can also be operated outside the hardware level, which makes them
independent of what is being run on the load-balancing tier. This provides better performance due to the way network resources are consumed. Additionally, by providing an Application Pool on each of the switches that deploy services and by using a Level 7 cascade architecture [14], it is possible to run multiple tasks at a higher balancing level, a feature that Load Balancers at the application level do not allow.
3.1.3 Tier 3 – High Availability Clustering [15]
A cluster is a management group with at least one service running on multiple nodes. The cluster continually tests the environment in order to keep the system safe from component failures. If a component fails, the management group intervenes on the node by reconfiguring the Intrusion Prevention System (IPS) and the application. This is an automatic operation, and the difference from the Load Balancer lies in the way data synchronization is done: in the case of clusters there is a lot of information to be synchronized, and therefore external data and SAN disks are used (as discussed above).
3.1.4 Business Intelligence Availability
In BI architectures, the key issue lies in identifying the type of tier and deployment to be used. The best model for extended availability in BI is to have a combination of different tiers, that is:
• Access: use Load Balancers
• Logic: use Load Balancers plus Clusters
• Data: use Active/Passive Clusters [16]
This architecture model allows a deployment with better performance and higher availability, ensuring a higher level of service to the users accessing the application. The services of each tier bring components that make it possible to evaluate how the deployed services will not be compromised. Web services allow multiple segmented documents to be executed and, as a result, virtualization plays an important role in this kind of architecture. In service applications, Load Balancers are already prepared for a component-oriented design. Finally, a high-availability data model is possible depending on the database engine, that is, on the way the service is presented (availability and performance).
3.2 Resource Optimization: Virtualization and Deployment Resources
Virtualization has taken many forms in technology and, unfortunately, it has become a buzzword. A historical review of the concept tells us that it comes from mainframes and the way in which resources were exposed. Nevertheless, at the moment it has a more important context because of its costs and ease of management. In architecture deployment, virtualization refers to the abstraction of server resources by a hypervisor or VMM (Virtual Machine Monitor) [17], which creates an abstraction layer between the hardware of the physical machine (host) and the operating system of the virtual machine (guest). In this way, the architecture creates a virtual version of a resource such as a server, a storage device, a network, or even an operating system, and divides the resources into one or more execution environments.
This software layer (VMM) operates, manages and arbitrates the four main resources of a computer: CPU, memory, network, and storage. Subsequently, the architecture can dynamically allocate those resources among all virtual machines defined on the central computer. This way, it is possible to have multiple virtual machines running on the same physical computer. Finally, one of the most important characteristics of virtualization is the possibility of hiding technical details through encapsulation. We recommend a virtualization solution for BI architectures for multiple reasons:
• Fast integration of new resources for virtualized servers.
• Reduced costs and space consumption.
• Global, centralized and simplified administration.
• Data centre management as a resource pool, grouping all processing power, memory, network and storage available on the infrastructure.
• Improved cloning processes and system backups.
• Isolation: a general failure of a virtual machine does not affect other virtual machines.
• Reduced downtimes.
• Dynamic balancing of virtual machines among physical servers.
• High overall user satisfaction.
In BI solutions, it is recommended to optimize the access and application tiers while the data layer should not be virtualized due to the way in which IO resources are accessed in a shared resource architecture.
4 BI Management – Provisioning and Management
Although the 'Cloud' concept is still a paradigm, one of today's challenges is to provide computing services through the internet, and this applies to all the resources described in this paper. In a cloud model, everything that a BI system can offer would be offered as a service; thus, users can access the services available in the cloud without knowledge (or at least without being experts) of the management of the resources they use. According to Hewitt [18], "Cloud Computing is a paradigm in which information is permanently stored in servers on the internet and cached temporarily on clients that include desktops, entertainment centers, table computers, notebooks, wall computers, hand-helds, sensors, monitors, etc". Although PC capabilities have improved dramatically, the paradigm is explained by the fact that much of a PC's power is wasted due to its condition of being a general-purpose machine. In terms of building blocks, however, it is important to formulate cloud infrastructures and platforms from two different perspectives that allow us to interpret what sort of resources each of them can offer. Management cuts across both, because no matter how resources are offered and consumed, what matters in this architecture is the way resources are managed.
5 Recommendations and Conclusions
For access platforms, the most important characteristic is to present a model that executes many processes at the same time; hence, the trend is multi-threaded execution. Consequently, it is all about parallel programming and the way in which network resources are distributed, offered and consumed. In general terms, a good network design does not necessarily imply very high bandwidth, but rather channel optimization and dynamic aggregation of the stack. Even 10Gb Ethernet can be a bad deployment choice if you provide a channel without multiplexing to the application that consumes the resources; in this case there would be a very good channel, but improper use of it if there are no connection-management applications. In terms of network resources, each server network interface opens a TCP/IP stack; if an instance is opened several times using tagging or card virtualization, access can be improved. In such a case, you should not expand the bandwidth, but use it better. Network applications such as DNS, DHCP, or any other should segment the services over the network resource by using multiple connections; methods such as virtualization allow this to be implemented. The simplest case is to manage a network interface with two IP addresses by using network aggregation or simply VIP tagging in the operating system. From our experience, the best way to improve performance on the network is to have multiple physical Network Interface Controllers (NICs); this prevents competition for resources and avoids conflicts and collision handling on routes. Applications have different types of workloads. Particularly in Business Intelligence, resource consumption occurs concurrently with what is being presented. Services are exposed and, in the usability outline, they must be consumed, but this is determined by the type of application. For our analysis, granularity is determined by the tiers, and the deployment of BI applications is optimized by using horizontal scaling together with vertical scaling for databases, or by a grid configuration with horizontal scalability. Finally, the architecture must be sized within its functional resources and according to the load profile of the software component that executes on each specific layer.
References
1. IBM. Three-tier architectures, http://publib.boulder.ibm.com/infocenter/wasinfo/v6r0/index.jsp?topic=/com.ibm.websphere.express.doc/info/exp/ae/covr_3tier.html
2. McLaughlin, B.: Business Logic, Part 1. Building Java Enterprise Applications, Architecture, vol. I. O'Reilly and Associates (2002)
3. Maiwald, E.: Network Security: A Beginner's Guide, 2nd edn. McGraw-Hill, Osborne (2003)
4. IBM. Semiconductor Solutions, http://www-03.ibm.com/technology/index.html
5. ORACLE. Sun SPARC Enterprise M5000 Server, http://www.oracle.com/us/products/serversstorage/servers/sparc-enterprise/031732.htm
6. Sun Microsystems. Multithread Programming Guide. Part No: 816–5137–12 (2008)
7. Elbaz, R., Champagne, D., Gebotys, C., Lee, R., Potlapally, N., Torres, L.: Hardware Mechanisms for Memory Authentication: A Survey of Existing Techniques and Engines. In: Gavrilova, M.L., Tan, C.J.K., Moreno, E.D. (eds.) Transactions on Computational Science IV. LNCS, vol. 5430, pp. 1–22. Springer, Heidelberg (2009)
8. Kasim, H., March, V., Zhang, R., See, S.: Survey on Parallel Programming Model. In: Cao, J., Li, M., Wu, M.-Y., Chen, J. (eds.) NPC 2008. LNCS, vol. 5245, pp. 266–275. Springer, Heidelberg (2008)
9. Kejariwal, A., Casçaval, C.: Parallelization spectroscopy: analysis of thread-level parallelism in hpc programs. In: Proc. of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Raleigh, NC, USA, pp. 293–294 (February 2009)
10. Wikipedia. File Area Network, http://en.wikipedia.org/wiki/File_area_network
11. Curtis, W.: Using SANs and NAS: Help for Storage Administrators. O'Reilly Media, Sebastopol (2002)
12. Sun Microsystems: Sun Cluster 3.x With Sun StorEdge A1000 or Netra st A1000 Array Manual for Solaris OS. Part No: 817–0171–10 (2004)
13. Speitkamp, B., Bichler, M.: A Mathematical Programming Approach for Server Consolidation Problems in Virtualized Data Centers. IEEE Transactions on Services Computing 99 (2010)
14. Kalligeros, K.: Platforms and real options in large-scale engineering systems. Massachusetts Institute of Technology, Engineering Systems Division (2006)
15. Chee-Wei, A., Chen-Khong, T.: Analysis and optimization of service availability in a HA cluster with load-dependent machine availability. IEEE Transactions on Parallel and Distributed Systems 18(9), 1307–1319 (2007)
16. Bianco, J., Lees, P., Rabito, K.: Sun Cluster 3 programming: integrating applications into the SunPlex environment. Sun Microsystems Press, Santa Clara (2005)
17. Tulloch, M.: Understanding Microsoft Virtualization R2 Solutions, pp. 23–29. Microsoft Press (2010)
18. Hewitt, C.: ORGs for scalable, robust, privacy-friendly client Cloud Computing. IEEE Internet Computing 12(5), 1–34 (2008)
Geometric Considerations of Search Behavior Masaya Ashida and Hirokazu Taki Wakayama University, 930 Sakaedani, Wakayama, Wakayama, Japan [email protected]
Abstract. In this article, the behavior of a heuristic search whose heuristic function is divided into two parts is discussed based on a geometric analysis, and a method for improving the heuristic function is proposed. One part of the divided heuristic function is called the short-term forecast; the other part is called the long-term forecast. Based on the discussion, it is suggested that a difference in the accuracy of the two forecasts may lead the search onto an incorrect path. Since improving the accuracy of the long-term forecast may make the search more efficient and prevent it from selecting an incorrect path, a method to update the value of the long-term forecast is proposed. It is shown that the proposed method is effective under tree structures in which the cost of an edge observed at a deep location is greater than or equal to that observed at a shallow location.
1 Introduction
In a problem solved by a tree search method, the problem-solving process becomes efficient by evaluating each node in the tree structure describing the state space. There are two types of node evaluation: one is the path cost from the initial node to the current node, and the other is the path cost from the current node to the closest goal node. The latter evaluation is given by a heuristic function. The efficiency of a search method with a heuristic function depends on the accuracy of that heuristic function. With inaccurate heuristic functions, the search may be led away from an appropriate solution or become overly cautious; as a result, revisiting some nodes decreases efficiency. For problems in which it is difficult to revisit expanded nodes, inaccurate heuristic functions may cause a decline in solution quality. The value of a heuristic function is regarded as a forecast of the costs required in the future. In environments where future costs can be forecast from the situation, an improvement of search efficiency can be expected by changing the heuristic values. For example, it is difficult for a car driver who knows the driving time of a given route only in normal traffic to find an alternative route efficiently in traffic congestion. An experienced driver who can forecast the driving time under congestion for any street in town will find another route more efficiently than the novice driver. The experienced driver will find another route even more
efficiently by forecasting the situation of a distant place near the destination in addition to that of the present location. In this article, we divide a heuristic function into two parts. One is the short-term forecast, which estimates the cost of each edge that appears when expanding the current node; the other is the long-term forecast, which estimates a path cost from the expanded node to the closest goal node. We discuss the behavior of the search with these two forecasts in terms of the difference in their accuracy. In order to examine this behavior, we illustrate the relation between the values of the two forecasts and their accuracy on a two-dimensional coordinate system. The discussion suggests that the accuracies of the two forecasts should be at almost the same level in order to avoid finding inappropriate solutions. Based on this result, and on the tendency of a short-term forecast to be more accurate than a long-term forecast, we propose a method to update the value of the heuristic function. It is shown that the proposed method can improve the long-term forecast under some assumptions about the state space. In the following sections, we first give the notation used in this article and the search strategy under consideration. Second, the behavior of the search is discussed in terms of the accuracy of the forecasts. Finally, an update method for the heuristic value is proposed and its properties are shown under some assumptions.
2 Notation and Search Strategy
2.1 Notation
A state space is described as a tree structure. v and e denote a node and an edge, respectively. Let v_i and v_{i+1} be adjacent nodes. The edge between these two nodes is denoted by e_i = [v_i, v_{i+1}]. Each edge has a positive cost, denoted by c(v_i, v_{i+1}), and this cost can be estimated; we denote the estimated cost by \hat{c}(v_i, v_{i+1}). A path from a node v_i to another reachable node v_j is denoted by p = [v_i, . . . , v_j]. Let w be a node and g_i be one of the leaf nodes reachable from w. The cost of a path p_i from w to g_i is denoted by h_{p_i}(w) and its estimate by \hat{h}_{p_i}(w). Since there is a path from w to each g_i, we denote the minimum value of h_{p_i}(w) by h(w) and its estimate by \hat{h}(w).
Fig. 1. v is the current node. S = {w_1, · · · , w_i, · · · , w_m} is the set of generated nodes. The w ∈ S which has the minimum value of \hat{f}(v, w_i) is selected.
2.2 Search Strategy
Let S = {w_1, . . . , w_m} be the set of nodes generated when expanding v, and let g be one of the leaf nodes reachable from w_i. In the search strategy under consideration, in order to decide which node w ∈ S is selected next, the estimated cost of a path from v to each g through w_i is evaluated. Since the actual cost is unknown, the evaluation function is defined as follows:

\hat{f}(v, w_i) = \hat{c}(v, w_i) + \hat{h}(w_i)    (1)

The evaluation function consists of two parts. \hat{c}(v, w_i) corresponds to the short-term forecast, which estimates the cost of each edge between v and w_i; \hat{h}(w_i) corresponds to the long-term forecast. Since there are several paths, one to each reachable leaf node, the long-term forecast estimates the minimum of the costs of these paths. In a heuristic search method with an OPEN list (openlist), like A* [1], all nodes in S, together with their evaluations, are merged into the openlist, the openlist is sorted in ascending order, and the head of the openlist is then selected as the current node. In a heuristic search method without an openlist, like RTA* [2], the node in S with the minimum value of \hat{f}(v, w_i) is selected.
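A minimal sketch of this selection step (the RTA*-like variant without an openlist) is shown below with toy estimates; the numbers are purely illustrative assumptions.

```python
def select_next(v, generated, c_hat, h_hat):
    """Pick the node w in S minimizing f_hat(v, w) = c_hat(v, w) + h_hat(w)."""
    return min(generated, key=lambda w: c_hat(v, w) + h_hat(w))

# Toy usage: two generated nodes with assumed forecast values.
c_hat = lambda v, w: {"w1": 2.0, "w2": 1.0}[w]   # short-term forecasts
h_hat = lambda w: {"w1": 3.0, "w2": 5.0}[w]      # long-term forecasts
print(select_next("v", ["w1", "w2"], c_hat, h_hat))  # -> "w1" (5.0 < 6.0)
```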
3 Behavior and Accuracy of Heuristic Function
3.1 Path Space
The path space is a plane with an x-y coordinate system [3]. The x-axis refers to c(v, w) and the y-axis refers to h(w). If the actual cost f(v, w) = c(v, w) + h(w) of a path from v to one of the reachable leaf nodes g through w is known, the path is plotted as a point in the path space at (x, y) = (c(v, w), h(w)). Consider a line in path space:

y = -x + c(v, w) + h(w)    (2)

f(v, w) appears as the y-intercept of this line. Let w_i be a fringe node, i.e., a node in the openlist or in the set of recently generated nodes, and let v be the parent node of w_i. The search process of selecting the w_i with the minimum value of f(v, w_i) among all fringe nodes is then equivalent to finding the line with the minimum y-intercept among all lines corresponding to the paths in path space.
3.2 Behavior in Selecting Node
Let 0 < α ≤ 1 and 0 < β ≤ 1 be the accuracy of the short-term forecast and of the long-term forecast, respectively. Using these factors and the actual costs, we express the values of both forecasts as:

\hat{c}(v, w) = \alpha\, c(v, w), \qquad \hat{h}(w) = \beta\, h(w)    (3)
Let w be the selected node and let w' be any other fringe node. The values of the evaluation function for w and w' satisfy the inequality \hat{c}(v, w) + \hat{h}(w) < \hat{c}(v, w') + \hat{h}(w'). Applying equation (3) to this inequality, we obtain:

h(w') - h(w) > -\frac{\alpha}{\beta}\,(c(v, w') - c(v, w))    (4)

This inequality shows that all other paths, (x, y) = (c(v, w'), h(w')), besides (x, y) = (c(v, w), h(w)), exist in the region expressed as follows:

y > -\frac{\alpha}{\beta}\,(x - c(v, w)) + h(w)    (5)

The gradient of the boundary line of region (5) changes with the ratio α/β. The behavior of a search using a short-term forecast and a long-term forecast is characterized by this ratio. There are three cases:
1. α/β = 1, that is, the accuracy of the two forecasts is at the same level.
2. α/β > 1, that is, the accuracy of the short-term forecast is higher than that of the long-term forecast.
3. α/β < 1, that is, the accuracy of the long-term forecast is higher than that of the short-term forecast.
Fig. 2. (a) The path space is an x-y coordinate system in which the plotted points correspond to possible paths from v to g via w when v is expanded. (b) The boundary line divides the path space into two regions.
Fig. 3. (a) The boundary line for α/β = 1. (b) This boundary line does not cause the missing region.
Case 1: α/β = 1. In this case, the existence region of all paths (c(v, w'), h(w')) other than (c(v, w), h(w)) corresponds exactly to the region of Fig. 2(b). This means that in this case the search does not select any w' as the next node (Fig. 3).
Case 2: α/β > 1. The gradient of the boundary line in this case (Fig. 4(a)) is steeper than that of the boundary line in Fig. 2(b). The existence region of all paths (c(v, w'), h(w')) other than (c(v, w), h(w)), derived from the accuracy ratio α/β > 1, does not cover the region of Fig. 2(b). The missing region is

y > -x + c(v, w) + h(w) \quad \text{and} \quad y < -\frac{\alpha}{\beta}\,(x - c(v, w)) + h(w)    (6)

If a path (c(v, x), h(x)) exists in this missing region, the line of slope −α/β through (c(v, x), h(x)) has the minimum y-intercept instead of the line with the same slope through (c(v, w), h(w)). As a result, the search may select the incorrect node x as the next node.
Case 3: α/β < 1. The gradient of the boundary line is gentler than that of the boundary line of the region in Fig. 2(b). The effect is similar to Case 2, because a missing
Fig. 4. (a) An example of a boundary line for α/β > 1. The gradient of the boundary line is steeper than the gradient of −1. (b) A missing region arises from the boundary line of Fig. 4(a). (The gradient of the thick line is −1.)
Fig. 5. (a) An example of a boundary line for α/β < 1. The gradient of the boundary line is gentler than the gradient of −1. (b) A missing region arises from the boundary line of Fig. 5(a). (The gradient of the thick line is −1.)
region arises for the same reason as in Case 2. If a path (c(v, x), h(x)) exists in the missing region, the search selects an incorrect node. Consequently, in a search strategy with a short-term forecast and a long-term forecast, it is important that the accuracies of the two forecasts be at the same level in order not to select an incorrect node.
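The effect of a mismatched ratio can be reproduced with a small numerical example (the costs below are invented for illustration): node w is truly better, yet a search whose short-term forecast is twice as accurate as its long-term forecast (α/β = 2) picks x.

```python
# (actual edge cost c, actual remaining cost h) for two candidate nodes.
# True costs: f(w) = 6 + 4 = 10 < f(x) = 2 + 10 = 12, so w is the correct choice.
candidates = {"w": (6.0, 4.0), "x": (2.0, 10.0)}

def pick(alpha, beta):
    # Forecasts as in equation (3): c_hat = alpha*c, h_hat = beta*h.
    return min(candidates,
               key=lambda n: alpha * candidates[n][0] + beta * candidates[n][1])

print(pick(alpha=0.5, beta=0.5))  # alpha/beta = 1 -> "w", the correct node
print(pick(alpha=1.0, beta=0.5))  # alpha/beta = 2 -> "x", an incorrect node
```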
4 Improvement of Long-Term Forecast
4.1 Assumption
In light of the considerations in the previous section, and of the general tendency of a long-term forecast to be less accurate than a short-term forecast, it is necessary to improve the accuracy of the long-term forecast in order for our search strategy to find a solution efficiently. We make some assumptions about the tree structure describing the state space under consideration.

Assumption 1. Let [v_0, v_1, . . . , v_n] be a path from the root node to one of the reachable leaf nodes. The costs of the two edges connected at v_i satisfy the following inequality:

c(v_{i-1}, v_i) \leq c(v_i, v_{i+1}), \quad \text{for } i = 1, 2, . . . , n-1    (7)

This assumption expresses that, along a path, the cost of an edge that appears at a shallow location is less than or equal to that of an edge that appears at a deep location.

Assumption 2. Let \hat{c}_0(v_i, v_{i+1}) be the initial value of the estimated cost of an edge, and let it be known. The initial value of the estimated cost and its actual cost satisfy the following inequality:

\hat{c}_0(v_i, v_{i+1}) \leq c(v_i, v_{i+1}), \quad \text{for } i = 0, 1, . . . , n-1    (8)

This means that the estimated cost is less than or equal to its actual cost.
4.2 Update Method
We propose a method to update the value of the long-term forecast. First, we define the estimated cost of a path from a node to a reachable leaf node.

Definition 1. Let p = [v_i, v_{i+1}, . . . , v_n] be a path in the subtree of which v_i is the root node. Using the estimated costs of the edges on the path, the estimate of the path cost is defined as follows:

\hat{h}_p(v_i) = \sum_{j=i}^{n-1} \hat{c}(v_j, v_{j+1})    (9)
Second, we define the long-term forecast.

Definition 2. Let {p_1, p_2, . . . , p_m} be the set of paths in the subtree of which v_i is the root node, and let \hat{h}_{p_k}(v_i) be the estimate of a path cost. The long-term forecast is defined as follows:

\hat{h}(v_i) = \min_{k=1}^{m} \hat{h}_{p_k}(v_i)    (10)
Finally, we define a rule to update the cost of each edge in the subtree.

Definition 3. Let [v_j, v_{j+1}] be an edge in the subtree of which v_i is the root node, and let c(v_{i-1}, v_i) be an observed actual cost. Each estimated cost \hat{c}(v_j, v_{j+1}) is updated by the rule:

\hat{c}(v_j, v_{j+1}) = \max\{\hat{c}(v_j, v_{j+1}),\; c(v_{i-1}, v_i)\}    (11)
This rule is applied to all edges in the subtree of which v_i is the root node when v_i is selected. The process of the update method is as follows. Each estimated edge cost is updated or preserved by the rule. After the rule has been applied, each estimated path cost \hat{h}_{p_k}(v_i) is recalculated, and the value of the long-term forecast is then also recalculated.
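A minimal sketch of the update rule and the recomputation of the long-term forecast is given below. The tree is represented as a dictionary mapping each node to its children and the estimated edge costs as a dictionary keyed by edge; this representation is an assumption for illustration, not one taken from the paper.

```python
def apply_update_rule(children, c_hat, v_i, observed_cost):
    """Eq. (11): raise every estimated edge cost in the subtree rooted at v_i
    to the newly observed cost c(v_{i-1}, v_i) whenever it is smaller."""
    stack = [v_i]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            c_hat[(node, child)] = max(c_hat[(node, child)], observed_cost)
            stack.append(child)

def long_term_forecast(children, c_hat, node):
    """Eqs. (9)-(10): minimum estimated path cost from `node` to any leaf."""
    succ = children.get(node, [])
    if not succ:                      # leaf node: no remaining cost
        return 0.0
    return min(c_hat[(node, ch)] + long_term_forecast(children, c_hat, ch)
               for ch in succ)
```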
4.3 Theoretical Properties
It is shown that the update method improves the accuracy of the heuristic function \hat{h}(v) under Assumption 1 and Assumption 2.

Lemma 1. An estimated cost updated by the update rule does not decrease during the search.

This lemma is trivial by the definition of the update rule, because an estimated cost is replaced either with the same value as before or with a greater value.

Lemma 2. The actual cost of each edge and its estimated cost satisfy the following inequality:

\hat{c}(v_j, v_{j+1}) \leq c(v_j, v_{j+1})    (12)

This lemma expresses that an estimated cost updated by the update rule does not exceed its actual cost during the search.

Proof. Let c(v_{i-1}, v_i) be the most recently observed actual cost and \hat{c}(v_j, v_{j+1}) be an estimated cost in the subtree of which v_i is the root node.

Case 1: \hat{c}(v_j, v_{j+1}) < c(v_{i-1}, v_i). In this case, the estimated cost is replaced with c(v_{i-1}, v_i), according to the update rule. By Assumption 1, c(v_{i-1}, v_i) is less than or equal to any other actual cost in the subtree. Therefore, we obtain the inequality:

\hat{c}(v_j, v_{j+1}) = c(v_{i-1}, v_i) \leq c(v_j, v_{j+1})    (13)
Case 2: \hat{c}(v_j, v_{j+1}) \geq c(v_{i-1}, v_i). If this inequality has been preserved since the beginning of the search, the estimated cost keeps its initial value and, by Assumption 2, inequality (12) is satisfied. The case in which the estimated cost is replaced with another observed actual cost for the first time has already been shown above as Case 1; once a replacement of the value has occurred, inequality (13) is satisfied. Therefore, we obtain the inequality:

c(v_{i-1}, v_i) \leq \hat{c}(v_j, v_{j+1}) \leq c(v_j, v_{j+1})    (14)
Inequalities (13) and (14) show that the estimated costs do not exceed their own actual costs during the search.

Theorem 1. Under Assumptions 1 and 2, the value of a long-term forecast to which the update rule is applied either keeps its previous value or improves, without exceeding the actual cost.

Proof. It must be shown that the value of the long-term forecast does not decrease and does not exceed the actual cost during the search. Let v be a node, and let p and q denote two different paths from v to a reachable leaf node. Suppose that \hat{h}(v) is given by p at search step T. Let p(T) and q(T) be the estimated costs of p and q at step T, respectively. According to Lemma 1, the following inequalities hold:

p(T) \leq p(T+1)    (15)
p(T) \leq q(T) \leq q(T+1)    (16)
Therefore, we find that \hat{h}(v) does not decrease at step T + 1. Replacing \hat{c}(v_j, v_{j+1}) with c(v_j, v_{j+1}) in equation (9), we obtain:

h_p(v_i) = \sum_{j=i}^{n-1} c(v_j, v_{j+1})    (17)
When \hat{h}(v) is given by \hat{h}_p(v) (that is, \hat{h}_p(v) is the minimum of all the forecast values), then, according to Lemma 2 and equation (17), we obtain:

\hat{h}(v) = \hat{h}_p(v) \leq h_p(v)    (18)
Now, suppose that h(v) < \hat{h}_p(v). Since h(v) is also given by a certain long-term forecast, there exists a path q such that

\hat{h}_q(v) \leq h_q(v) = h(v)    (19)
and \hat{h}_q(v) is the minimum of all the forecast values. Hence, we have the contradiction that an \hat{h}_q(v) exists such that

\hat{h}_q(v) < \hat{h}_p(v)    (20)

Therefore, \hat{h}(v) does not exceed h(v).
5 Conclusions
For heuristic search methods on tree structures, we divided a heuristic function into two parts, as in RTA*. The first part is a short-term forecast, which is the estimated cost of an edge connecting the current node and one of its child nodes. The second part is a long-term forecast, which is the estimated cost of a path from the current node to a reachable leaf node. Our analysis suggests that it is important for the accuracies of both forecasts to be at the same level in order not to select an incorrect path. In general, a long-term forecast tends to be less accurate than a short-term forecast, and a more accurate heuristic function can make the search more efficient. Upgrading the long-term forecast therefore contributes to improving search efficiency and to leading the search in the correct direction. We proposed an update method in which the value of the long-term forecast is updated with the most recently observed actual cost. Under some assumptions, this method has the properties that the value of the long-term forecast does not decrease and does not overestimate the actual value during search execution. Owing to these properties, it is expected that a search method equipped with the update method keeps or improves the performance of the same search method without it.
References

1. Nilsson, N.J.: Problem-Solving Methods in Artificial Intelligence. McGraw-Hill Pub. Co., New York (1971)
2. Korf, R.E.: Real-time heuristic search. Artificial Intelligence 42(2-3), 189–211 (1990)
3. Vanderbrug, G.J.: A geometric analysis of heuristic search. In: Proc. AFIPS 1976, pp. 979–986 (1976)
4. Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach, 2nd edn. Prentice Hall, Englewood Cliffs (2002)
A Web-Community Supporting Self-management for Runners with Annotation

Naka Gotoda1, Kenji Matsuura2, Shinji Otsuka1, Toshio Tanaka3, and Yoneo Yano4

1 Graduate School of Advanced Technology and Science
2 Center for Advanced Information Technology
3 Center for University Extension
4 Institute of Technology and Science
The University of Tokushima, 2-1, Minamijosanjima, Tokushima, 770-8506, Japan
{gotoda,otsuka,yano}@is.tokushima-u.ac.jp, [email protected], [email protected]
Abstract. This paper proposes a web-based community environment called “e-running” that supports the development of self-management for runners. The objective of the support is to promote physical skill development that stabilizes the heart rate of runners through adjustments of their pace while running. In this scenario, runners wear a GPS (Global Positioning System) receiver and an HRM (Heart Rate Monitor) during training and afterwards review a pace-simulation with avatars, based on the GPS data, in the online community in order to recall points to improve before the next training. In this simulation, runners are given map-based annotations attached to such improvement sections. Concretely speaking, an agent function provides opportunities to reflect upon the management of their pace and helps them comprehend the relationship between their heart rate and their pace. It also takes account of the variety of landscape along the course as an environmental element that affects their pace. We report the idea and the design. Keywords: physical skill development, map-based annotation, animated pace-simulation, e-running.
1 Introduction
The number of runners in Japan has been increasing steadily as people become more health conscious under social influences such as the government's health promotion law and concern about metabolic syndrome [1]. Running is well known as a form of exercise. In addition, running requires no special equipment, and runners can enjoy daily training alone. People who keep irregular hours particularly appreciate these benefits of running. However, while such flexibility makes running highly popular, it also exposes runners to several problems. They might sometimes lose their motivation
Research Fellow of the Japan Society for the Promotion of Science.
because of a lack of support from fellow runners or the absence of a clear-cut goal. Some runners who have already improved their own health may look for another objective, such as participating in a race event. Physical skill development in running is essential as one of the spontaneous motivations of such runners [2]. A conceivable way to meet their needs is a running community, including a club-like organization [3]. A running community acts as a place for people to share knowledge about how to run. We have already started a research project on SNS (Social Networking Service) based learning support to maintain motivation online. We call this project “e-running” [4]. The latest version of this project provides an animated pace-simulation on a geographic map after each training session. The simulation with avatars, based on GPS (Global Positioning System) tracking data, reminds runners of the points to improve before the next training [5]. Hence, this paper builds on the background of the previous work and addresses the physical skill development of running with such an SNS. The framework helps runners comprehend the relationship between their heart rate and their pace with map-based annotations and an agent function that takes the variety of landscape into account.
2 Physical Skill Development of Running

2.1 Theoretical Background
In the field of sports science, human motor skill has a long history as a domain of study [6]. According to Gentile's well-known taxonomy of motor skills, human skill can be divided into a two-dimensional space defined by an environmental-context axis and a function-of-action axis [7]. Within the environmental context, two factors are considered: the regulatory conditions and the inter-trial variability. As for the function of action, two categories are defined: body orientation and manipulation. Running is one of the human skills that appear in this taxonomy. In addition, the discussion of environmental context has been framed in terms of the difference between “open skills” and “closed skills”. Stationary conditions without variability from response to response represent a closed task, while the presence of motion and inter-trial variability represents an open task. Concretely speaking, a closed skill is self-paced and performed in a stable, unchanging environment. By contrast, an open skill is performed in a variable environment and must be repeatedly adapted to the changing demands of that environment. Regarding running from these viewpoints, indoor running on a treadmill for the purpose of correcting posture or motion is regarded as closed skill development, because runners can train freely according to a preset program at a uniform pace without external disturbances. On the other hand, open skills are needed for outdoor field training in the real world, because runners have to perform against unstable conditions such as uphill and downhill sections. The style of running targeted here belongs to open skill development, because the runners intended in this research train in the surrounding fields where they exercise in daily life.
2.2 Running in a Real/Virtual Community
Most traditional research on computer-supported education, technology-enhanced learning, computer-supported collaborative learning, and so forth assumes real instructors, teachers, or a predefined strategy by which a computer program leads learners to the correct answer. These aspects, in fact, have a strong connection to physical learning and training domains because they are more skill-oriented [9]. However, recent SNS-based learning offers learners a different condition. Generally, users in an SNS have a horizontal relationship with each other and are, in a sense, nonconformist. They do not have to be aware of their roles or positions in the real world. Therefore, they can freely explore the virtual space and learn from what they discover online. If runners' friends post notes about their daily training, methods, and topics related to courses or advice, runners can adopt these articles for their next training. In addition, in the real world, runners may find companions who mutually point out each other's weaknesses or give advice. In that case, they can keep up the friendship and stay highly motivated. However, they are likely to run into difficulties with schedule coordination, such as selecting a course or a time. In such a case, a virtual sharing space can support their collaboration or competition around running in a way that makes their training sustainable. Moreover, the rules of running are extremely simple. For this reason, there are many runners with different objectives and corresponding problems. In particular, an SNS can provide several ways to search for peers, helpful tools that enable such runners to contact each other easily across generations and gender. Therefore, an SNS is a good medium for runners to communicate with peers. The fundamental idea of this study stems from this motivation.

2.3 Direction for Adequate Running Skill
In the sports science of running, researchers have proposed taxonomies of runners' pacing of exercise intensity during a run. A typical taxonomy defines three types: ascend, even and descend (Fig. 1).
Fig. 1. Three typical types of running
Runners categorized as even-type keep their pace from the beginning to the end. In contrast, runners who belong to the ascend-type hold back their pace at the beginning but accelerate gradually until finishing the course. Runners of the descend-type show the opposite pattern to the ascend-type. For novice and middle-level runners, the fundamental instruction strategy is to have them run at an even pace over the whole planned course [4], because running at a steady pace is desirable as one of the objectives of running. A runner who wants to lose weight effectively needs to keep moving steadily in order to run for as long as possible. Moreover, in the framework of this research, such a stable condition of the runner means a stable heart rate rather than a pace based on velocity during running. The desired rate of fundamental exercise intensity based on heart rate is defined from several conditions, including the goal of the running exercise [10][11]. Runners set themselves the task of maintaining a steady heart rate by managing their pace, which is affected by the variety of landscape along the course. However, runners have no means of capturing information about their running without sensing devices or monitoring tools. Therefore, we adopt a GPS sensor and an HRM (Heart Rate Monitor) device that can be worn directly by runners. Fig. 2 shows the equipment used in our study for field training. Runners upload these data to the community site via the web interface.
Fig. 2. Strap-on HRM and wrist-type GPS (Garmin Forerunner 310XT)
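As a concrete illustration of how a heart-rate-based target intensity can be defined, the sketch below uses the widely cited Karvonen formula. This is only an assumption for illustration: the paper does not state which definition from [10][11] it adopts, and the function name and example values are hypothetical.

```python
def karvonen_target_hr(age, resting_hr, intensity):
    """Karvonen formula: target heart rate for an exercise intensity in [0, 1].

    The maximum heart rate is approximated as 220 - age; the heart-rate
    reserve (max - resting) is scaled by the intensity and added back to
    the resting rate.
    """
    max_hr = 220 - age
    return (max_hr - resting_hr) * intensity + resting_hr

# Example: a 40-year-old runner with a resting HR of 60 bpm training at 60% intensity
print(karvonen_target_hr(40, 60, 0.6))  # -> 132.0 bpm
```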
3 Scenario Supporting Self-management

3.1 “e-running” Project
Fig. 3 shows the conceptual diagram of the “e-running” project. In this project, runners alternate between two training modes. In the field, they train wearing the GPS and HRM devices. After running, in contrast to this practice stage, they review a simulation of the training in a web browser. In addition to several communication tools, including private messages and the community function, the SNS stores the running data and provides the simulation, which evokes memories of the training. Through this environment, runners can obtain the advice they need from other users during the reflection stage. The following functions support the reflection stage.

Fig. 3. “e-running” project based on the cycle between field and simulation training

Fig. 4. An interface to show the animated pace-simulation
3.2 Animated Pace-Simulation
The main objective of this study is to present runners with an environment that affords them the chance to realize where they should make an effort and how to improve their running, with due consideration of stabilizing their heart rate. To provide such an affordance, the basic idea is to offer every runner in the community an animated pace-simulation together with a graph showing the changes in their heart rate and pace. We designed a competitive animation among runners who share the same course but cannot run together at the same time (Fig. 4). The avatars of the runners re-enact the competition on a Google Map based on past running records. The map provides the time information, relative positions, and replaying speed to the user. Users can freely change the replaying speed and their position in the race as they like. If several runners train on the same course, they can share the simulation for their reflection. In such a situation, a runner can become aware of what s/he must reflect upon through an easy comparison with the other runners.

3.3 Map-Based Annotation
Some community sites provide a blog function for running. However, most of them do not allow users to take notes directly on the map. We designed this function ourselves (Fig. 5 and Fig. 6). The annotation input form works on top of the animated pace-simulation. The location of an annotation being created corresponds to the display position of the target avatar, and the content of a registered annotation is shown as a pop-up when the avatar passes by. This supports a degree of meta-cognition for each runner, because s/he can review the run from an objective viewpoint. Moreover, runners can point to a position on the map with the mouse while replaying their running records and take notes directly there. For example, this function is useful for a runner who wants to remember the position at which to brace against fatigue during the next run. Furthermore, the runner can check how the current run differs from past runs around that position. In this way, runners can improve themselves through the simulation and annotations on the map.

Fig. 5. Input form of an annotation on the map

Fig. 6. View as pop-up in passing by the avatar

Fig. 7. The difference between the current and proposed scenarios

3.4 Agent Function
The existing design of the e-running project, including the support scenario mentioned above, places an emphasis on promoting motivation. Therefore, our community site fundamentally allows a high degree of freedom in learning options in order to accept various types of runners. In this work, we designed an additional function to address a concrete objective: a specific aspect of physical skill development. To help runners comprehend the relationship between their heart rate and their pace, we propose an agent function which predicts the cause of a problem and recommends appropriate advisers based on the data stored in the system. Fig. 7 illustrates the difference between the current and proposed scenarios. The activities of users are largely unchanged. Instead, users are automatically given annotations in the simulation environment and candidate advisers through the SNS message box. Every time training data are uploaded to the system, the agent function analyzes the changes in the user's GPS and HRM data to detect the section suspected to be the cause of a problem. In addition to these changes, the variety of landscape along the course is used, together with the evaluation database in the system, to facilitate understanding of the cause.
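The detection step is described only at this high level, so the following is merely one conceivable way to flag such a suspect section from synchronized pace and heart-rate samples; the thresholds, sampling assumptions and function names are hypothetical and not taken from the paper.

```python
def find_suspect_sections(samples, hr_target, hr_tolerance=10, pace_jump=0.5):
    """Flag sample ranges where the heart rate leaves the target zone
    while the pace changes noticeably.

    samples: list of (pace_m_per_s, heart_rate_bpm) pairs at fixed intervals.
    Returns a list of (start_index, end_index) ranges of flagged samples.
    """
    flagged = []
    for i in range(1, len(samples)):
        pace_prev, _ = samples[i - 1]
        pace, hr = samples[i]
        hr_off_target = abs(hr - hr_target) > hr_tolerance
        pace_changed = abs(pace - pace_prev) > pace_jump
        flagged.append(hr_off_target and pace_changed)

    sections, start = [], None
    for i, is_flagged in enumerate(flagged, start=1):
        if is_flagged and start is None:
            start = i
        elif not is_flagged and start is not None:
            sections.append((start, i - 1))
            start = None
    if start is not None:
        sections.append((start, len(samples) - 1))
    return sections
```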
3.5 Diagnostic Wizard for Improvement
The diagnostic wizard is attached to an annotation for runners. Fig. 8 shows its process. The wizard is composed of three steps: awareness, cause, and solution. In the awareness step, the system checks whether the user is aware of the detected problem. In this case, the system asks the user whether s/he remembers anything about this section. This step confirms the relationship between the unexpected change and the user's awareness of it. In the next step, the system pursues the cause of the problem. Here, the user can choose among several factors: a self factor, such as tiredness from having worked out; a geographical factor, such as a slope; and a weather factor, such as a headwind. The system presents not only the options but also the predicted candidate as the potential cause. The runner can dismiss exceptions, such as a traffic light, depending on the training situation. The system promotes the runner's reflection by making him/her conscious of the cause. In the last step, the system has the user enter an improvement plan in consideration of the current pace. The runner defines a numerical target pace around the section for the next training.
Fig. 8. Diagnostic wizard attached to an annotation
If the answer in the first step of the wizard is “no”, the system provides the information needed for the user to recognize the problem. Concretely speaking, the user is given a message referring to similar types of problems. In the other cases, apart from the exceptions, the system recommends advisers to the user through messages. The system refers to the annotations of all users and extracts candidate advisers by retrieving annotations for the same section. The system also provides the user with improvement examples based on the accumulated training data.
4 Summary

This paper described the design of an ongoing research project that supports runners with map-based annotations attached to runner avatars. The pace-simulation in the online community is realized with GPS data from real-world training. On the pace-simulation, runners answer the questions of the diagnostic wizard attached to an annotation in order to promote their comprehension of the relationship between their heart rate and their pace. The system then recommends advisers based on the answers given to the diagnostic wizard.

Acknowledgments. This work was partially supported by Grant-in-Aid for Young Scientists (B) No.20700641, Scientific Research (B) No.22300291 and JSPS Fellows No.22.10013.
References

1. Matsuura, K., Kanenishi, K., Miyoshi, Y., Sagayama, K., Yano, Y.: Promoting Physical Skill Development in a Video-Based WEBlog Community Environment. In: Proc. of ED-MEDIA 2008, pp. 1089–1096 (2008)
2. Gotoda, N., Matsuura, K., Kanenishi, K., Yano, Y.: Organizing Online Learning-Community Based on the Real World Log. In: Apolloni, B., Howlett, R.J., Jain, L.C. (eds.) KES 2007, Part III. LNCS (LNAI), vol. 4694, pp. 608–615. Springer, Heidelberg (2007)
3. Prochaska, J.O., DiClemente, C.C.: Stages of Change in the Modification of Problem Behaviors. In: Hersen, M., Eisler, R.M., Miller, P.M. (eds.) Progress in Behavior Modification, pp. 184–214. Sycamore Publishing Company, Sycamore (1992)
4. Gotoda, N., Matsuura, K., Kanenishi, K., Yano, Y.: Organizing Online-Community for Jogging Support based on the Physical Log. Journal of Human-Interface Society 10(2), 19–29 (2008)
5. Matsuura, K., Gotoda, N., Otsuka, S., Nabeshima, T., Yano, Y.: Supporting Joggers in a Web-Community with Simulation and Annotation Functions. In: Proc. of ICCE 2009, pp. 119–123 (2009)
6. Higgins, S.: Motor Skill Acquisition. Physical Therapy 71, 123–139 (1991)
7. Gentile, A.M.: Skill acquisition: Action, movement, and neuromotor processes. In: Carr, J.H., Shepherd, R.B. (eds.) Movement Science: Foundations for Physical Therapy in Rehabilitation, pp. 111–187 (2000)
8. Allard, F., Starkes, J.L.: Motor-skill experts in sports, dance, and other domains. In: Ericsson, K.A., Smith, J. (eds.) Toward a General Theory of Expertise, pp. 126–150. Cambridge University Press, Cambridge (1991)
9. Vanden Auweele, Y., Bakker, F., Biddle, S., Durand, M., Seiler, R.: Psychology for Physical Educators. FEPSAC, Human Kinetics (1999) (translated into Japanese by TAISHUKAN Publishing Co., Ltd.)
10. Pekkanen, J., Marti, B., Nissinen, A., Tuomilehto, J., Punsar, S., Karvonen, M.J.: Reduction of premature mortality by high physical activity: A 20-year follow up of middle-aged Finnish men. Lancet 1, 1473–1477 (1987)
11. Pierre, F.S., Brackman, J.P.: Frequence cardiaque chez le sportif amateur, http://fsp.saliege.com/
An Analysis of Background-Color Effects on the Scores of a Computer-Based English Test

Atsuko K. Yamazaki

College of Engineering, Shibaura Institute of Technology, 307 Fukasaku, Minuma-ku, Saitama-shi, Saitama, Japan
[email protected]
Abstract. The effects of background colors on the scores of a computer-based English test were analyzed. Eight combinations of black text and a background color were chosen to examine whether test takers' scores differed depending on the characteristics of the background color. Two hundred and forty-six Japanese college students participated in the experiment. A computer-based test (CBT) resembling the TOEIC® Test was used, and the subjects, divided into eight groups, answered the same questions displayed on backgrounds of eight different colors. The results showed that the subjects who took the test with the light blue and blue backgrounds scored the highest and second highest, respectively. Although many CBTs use the combination of a white background and black text, this combination ranked second worst. The results indicate that background colors with higher luminance and brightness values can cause lower performance, and that a test taker's color preference does not affect his performance. Keywords: background and text colors, computer-based test, test performance.
1 Introduction

The selection of text and background colors for web pages has been evaluated in many studies in order to improve web usability. Many studies on colors used in web design have pointed out that the combination of background and text colors is important for performing tasks on web pages, because it often determines the readability of web pages. From their experimental results, Hall and Hanna found that greater contrast ratios between background and font colors led to better readability. Their results, however, showed that the combination of background and text colors did not significantly affect user retention [1]. Other studies found that the luminance contrast ratio between the text and its background colors can affect readability [2][3][4]. Nishiuchi conducted a visibility study of characters on a web page and found that the best coloration was the combination of a background color in the cold hue group and a character color in the warm hue group [5]. Some studies have been conducted to examine the effects of color combinations used for computer or web-based teaching and learning materials [6][7]. However, not much research has been done to investigate if colors affect the performance of test takers on computer or web-based tests (WBTs) in terms of their test scores.
Since the early eighties, computer-based tests (CBTs) have been widely used, because of their cost efficiency and immediate feedback on performance. In particular, web-based language testing to assess English proficiency, such as the Cambridge Computer Based Tests, the TOEFL® TEST and the TOEIC® Test, has been used by many organizations including schools and corporations, to assess the English proficiency of their students or employees. These WBTs usually use pages with black text and white background, which resemble conventional CBTs. In this study, the author examined whether or not the background colors of a computer-based test had significant influence on the scores of test takers. The author also investigated if test taker’s preference for colors was related to his/her performance.
2 Experiment

The experiment was conducted in two phases. In the first phase, the effects of white and the three primary colors as the background color of a computer-based English test were examined. In the second phase of the experiment, four intermediate colors were chosen as the background color to be tested, taking account of the results from the first phase. The text color of all questions was black for both phases. All the subjects in this experiment responded to a questionnaire about their color preferences, and their fatigue and concentration levels after taking the test.

2.1 Computer-Based Test

In this study, we adapted a part of a computer-based English test that was delivered in a web browser with JavaScript plug-in. The test was designed to provide preparation exercises for an English language proficiency test called the TOEIC® Test. The TOEIC® Test is a multiple choice exam that consists of two parts with 100 questions each: the Listening Section and the Reading Section. The Reading Section is divided into three parts: Incomplete Sentences (Part V of the test), Text Completion (Part VI) and Reading Comprehension (Part VII). The Incomplete Sentences part has 40 sentence completion questions and each question contains a sentence with a blank and four choices for the blank. The computer-based English test has 200 incomplete-sentence questions of the same type and a similar difficulty level as Part V of the TOEIC® Test. We chose 40 questions out of the 200 questions for this study. In the computer-based English test, each sentence completion question was presented on one web page. A test taker selected an answer from four choices labeled (1), (2), (3) and (4), and chose one of them from a drop-down list to indicate his answer. A test taker advanced the page to the next question by clicking on the “send” button. In the original computer-based test, all sentences and word choices were given in black text against a white background. An example of the test web pages is shown in Figure 1.

2.2 Subjects

Two hundred and forty six subjects participated in the experiment. All of them were first-year college students whose first language was Japanese. They were all male and none of them was reported to have a color vision deficiency at the time of the
experiment. They all took a paper-based TOEIC® Test three months prior to the experiment. One hundred and seven subjects took the test in the first phase and 139 subjects did in the second phase. The subjects were divided into four groups in each phase. An ANOVA test was performed on their TOEIC® Test scores to see if there was no significant difference in English reading proficiency among the groups. In the ANOVA test, we considered subjects’ scores from the Reading Section of the TOEIC® Test since they answered incomplete-sentence questions that were very similar to questions in the Reading Section of the TOEIC® Test. The ANOVA test result showed that there was no significant difference among the eight subject groups in terms of their TOEIC® Test Reading Section scores (α=0.05, p=0.34). Therefore, we assumed that each group assigned to a different background color had the same level of English reading proficiency at the time of the experiment.
Fig. 1. An example of the computer-based test pages
2.3 Method A set of computer-based English tests with eight different background colors and black text was developed for this study from the original CBT. All the tests were made the same, except their background colors. Each group of subjects took the test with black characters and background in one of the colors shown in Table 1. The table also summarizes their hexadecimal color codes, brightness and color differences between the text and background, and luminance ratios between the text and background. Formulas suggested by the World Wide Web Consortium were used to calculate these values [8]. Test scores obtained from the groups were analyzed to see if there was a difference among the groups assigned to the different background colors. The total test time was set to 30 minutes. The subjects were given instructions for taking the CBT prior to undertaking the test together in a computer room. They
answered an example question having black text and white background before starting the 40-question test. In the first phase of the experiment, white, blue, red and yellow were chosen for the background colors. Twenty eight subjects were tested with a blue background, 25 with a red background, 31 with a white background and 23 with a yellow one. In the second phase, 139 subjects took the test with a background in one of the intermediate colors, such as light blue, pink, light yellow and light green. They were divided into four groups. Thirty subjects were assigned to the light blue background group and 37 subjects to the pink background group, 26 subjects to the light-yellow background group, and 36 subjects to the light green background group. Each subject obtained 10 points for each correct answer, and his total score was accumulated and recorded in the server of the CBT.

Table 1. Brightness difference, color difference and luminance ratio between each background color and black text color (maximum values: Brightness=255, Color difference=765, Luminance ratio=1)

Background color | Hexadecimal color code | Brightness difference | Color difference | Luminance ratio | Luminance
Light Blue       | #C0C0FF | 199 | 639 | 12.225 | 0.56126
Blue             | #0000FF | 29  | 255 | 2.44   | 0.00722
Pink             | #FFC0C0 | 211 | 639 | 13.553 | 0.62765
Light Green      | #C0FFC0 | 229 | 639 | 18.306 | 0.86532
Red              | #FF0000 | 76  | 255 | 5.252  | 0.2126
Light Yellow     | #FFFFC0 | 284 | 702 | 20.317 | 0.96586
White            | #FFFFFF | 255 | 765 | 21     | 1
Yellow           | #FFFF00 | 226 | 510 | 19.55  | 0.927
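For reference, the W3C formulas cited for Table 1 can be reproduced as in the sketch below, assuming the commonly published W3C AERT brightness and color-difference formulas and the WCAG relative-luminance/contrast-ratio definitions; these assumptions match the tabulated values (for example, white on black gives a brightness difference of 255, a color difference of 765 and a luminance ratio of 21). The function names are our own.

```python
def rgb(hex_code):
    """'#RRGGBB' -> (R, G, B) integers in 0-255."""
    h = hex_code.lstrip('#')
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))

def brightness(c):
    """W3C perceived brightness (0-255)."""
    r, g, b = c
    return (299 * r + 587 * g + 114 * b) / 1000

def color_difference(c1, c2):
    """W3C color difference (0-765): sum of the channel differences."""
    return sum(abs(a - b) for a, b in zip(c1, c2))

def relative_luminance(c):
    """WCAG relative luminance (0-1)."""
    def channel(v):
        v /= 255
        return v / 12.92 if v <= 0.03928 else ((v + 0.055) / 1.055) ** 2.4
    r, g, b = (channel(v) for v in c)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def luminance_ratio(c1, c2):
    """WCAG contrast ratio between two colors (1-21)."""
    l1, l2 = sorted((relative_luminance(c1), relative_luminance(c2)), reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

black = rgb('#000000')
for name, code in [('Light Blue', '#C0C0FF'), ('White', '#FFFFFF')]:
    bg = rgb(code)
    print(name, round(brightness(bg) - brightness(black)),
          color_difference(bg, black), round(luminance_ratio(bg, black), 3))
```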
After completing the test, each subject responded to a questionnaire that was designed to see if each subject’s color preference, concentration level and feelings about the test had affected his test score. One of the questions asked the subjects to provide their preference for the background colors used in the experiment using a five-point scale from 1= “don’t like the color at all” to 5 = “like the color very much”. The subjects also indicated their concentration levels at the time of taking the test using a five-point scale from 1= “could not concentrate at all” to 5= “could concentrate very well”. They also responded to questions about how difficult they felt the test was and how tired they felt after the test with another five point scale.
3 Results The average test scores obtained from each background color group are summarized in Table 2. Among the eight groups, the light blue background group scored the
highest, followed by the blue background group, the pink background group and the red background group. The subjects who took the test with the yellow background scored the lowest on average among the groups, while the white and light yellow background groups marked the second and third lowest. The author performed paired t-tests (α=0.05) on the test scores obtained from the subjects assigned to the white background and those from the subjects who took the test with one of the other background colors. Paired t-test results showed that there was no significant difference at the p<0.05 level between the white background and any of the other background colors examined in this study. Results from the questionnaire were also analyzed in relation to the subjects' test scores. The author examined if the preference score given by a test taker to the background color of the test he took correlated with his test score. The analyses showed

Table 2. The average test scores of the subjects in eight background color groups

Background color | The number of subjects | Average test score
Light Blue       | 30 | 129.00
Blue             | 28 | 125.71
Pink             | 37 | 124.86
Light Green      | 36 | 123.33
Red              | 25 | 122.00
Light Yellow     | 34 | 121.41
White            | 31 | 120.97
Yellow           | 23 | 109.13
Table 3. Test score rankings, average preference scores and average concentration scores of the eight background color groups

Background color | Test score ranking | Average preference score [ranking] | Average concentration score [ranking]
Light Blue       | 1 | 3.511 [3] | 3.367 [1]
Blue             | 2 | 3.561 [2] | 2.857 [7]
Pink             | 3 | 3.180 [8] | 3.189 [2]
Light Green      | 4 | 3.335 [4] | 2.861 [6]
Red              | 5 | 3.318 [5] | 2.480 [8]
Light Yellow     | 6 | 3.237 [6] | 3.147 [3]
White            | 7 | 3.832 [1] | 3.097 [4]
Yellow           | 8 | 3.196 [7] | 3.043 [5]
that there was no correlation between them, since the correlation coefficient r was between -0.01 and +0.2 for all the background colors. In addition, the order of the color preferences did not correspond to the order of the test score averages (Table 3). White received the highest preference among the background colors used in this experiment, although the subjects in the white background group marked the second lowest score on average. Table 3 shows the average preference score for each color obtained from the subjects. Results for the concentration levels of the subjects showed that the intermediate-colored backgrounds were better than the primary-colored ones for keeping their concentration. The light blue background marked the highest concentration score, followed by the pink and light yellow ones, and their average concentration scores were better than that of the white background. The average concentration score obtained from the subjects in each background color group is summarized in Table 3, with its ranking among the background colors. Correlation analyses performed on the scores for the concentration levels and the test scores showed low coefficient values for all the background colors (r < 0.2). Other analyses also showed no correlation between the subjects' test scores and the values for the degree of fatigue that the subjects indicated. In addition, there was no correlation between the subjects' test scores and the values for the difficulty levels that the subjects felt toward the test.
4 Discussion and Conclusion

The results of the experiment in this study demonstrate that the background color of a CBT can affect the performance of a test taker. They suggest that blue colors may be better for the background of a CBT when question sentences are displayed in black. The results also suggest that the background color can make a difference to the concentration level of the test taker, indicating that primary colors with low luminance ratios cause test takers to lose their concentration. On the other hand, the results of this study show that the color preference of the user is not an important factor in choosing the best background color for a CBT, although Hall and Hanna mention that a preferred color can lead to a higher rating of behavioral intention [1]. White, yellow and light yellow have similar characteristics when the text color is black, as shown in Table 1. Compared to the other colors, they have higher brightness differences and luminance ratios with respect to black. Since the subjects who took the test with these background colors tended to score low, a combination of black text and a background color with high luminance and high brightness is not considered preferable for CBTs. In this study, the levels of fatigue and difficulty that test takers experienced did not correlate with their test scores, in addition to their color preferences. Many studies have mentioned that hue differences in color combinations can induce physiological changes in people. Jang et al. point out that the distinction between warm and cold colors can affect the usability of web pages [7]. Since the blue colors achieved the two highest test scores in this study even though their color characteristics differ considerably in terms of luminance and brightness, other physiological factors associated with color characteristics need to be investigated in relation to the scores of CBT test takers.
Acknowledgments. The author would like to thank Chieru Co., Ltd. for its assistance in developing a set of CBTs with different background colors. The author also would like to extend her gratitude to the students at Institute of Technologists who participated in this study.
References

1. Hall, R., Hanna, P.: The Impact of Web Page Text-Background Color Combinations on Readability, Retention, Aesthetics, and Behavioral Intention. Behaviour & Information Technology 23(3), 183–195 (2004)
2. Lin, C.: Effects of contrast ratio and text color on visual performance with TFL-LCD. International Journal of Industrial Ergonomics 31(2), 65–72 (2003)
3. Notomi, K., Hiramatsu, A., Saito, K., Saito, M.: A fundamental study on visibility of background and character colors at the web browsing. Biomedical Soft Computing and Human Sciences 9(1), 17–25 (2003)
4. Saburi, S.: Structure of visibility for the combination of background and character colors: an analysis by a maximum likelihood asymmetric multidimensional scaling. The Japanese Journal of Behavior Metrics 35(2), 193–201 (2008)
5. Nishiuchi, N., Yamanaka, K., Beppu, K.: A study of visibility evaluation for the combination of character color and background color on a web page. In: The International Conference on Secure System Integration and Reliability Improvement, pp. 191–192 (July 2008)
6. Tharangie, K.G.D., Irfan, C.M.A., Marasinghe, C.A., Yamada, K.: Kansei Engineering Assessing System to enhance the usability in e-learning web interfaces: Colour basis. In: 16th International Conference on Computers in Education, vol. 1(1), pp. 145–150 (2008)
7. Jang, Y.G., Kim, H.Y., Yi, M.K.: A color contrast algorithm for e-learning standard. International Journal of Computer Science and Network Security 7(4), 195–201 (2007)
8. World Wide Web Consortium, http://www.w3.org/TR/AERT
Message Ferry Route Design Based on Clustering for Sparse Ad hoc Networks

Hirokazu Miura, Daisuke Nishi, Noriyuki Matsuda, and Hirokazu Taki

Faculty of Systems Engineering, Wakayama University, 930, Sakaedani, Wakayama, 640-8510, Japan
[email protected]
Abstract. In a message ferry technology for sparse ad hoc networks, a ferry node stores a message from a source node and carries it to the destination node. In this technology, message ferry route design is one of the most important technical issues. To improve the performance, we propose a cluster-based route design where the ferry moves along the path formed by the centers of the clusters. Keywords: Ad hoc networks, Message ferry, Clustering, Route design.
1 Introduction

Mobile ad hoc networks (MANETs) are multi-hop networks in which wireless mobile nodes cooperate to maintain network connectivity and perform routing functions [1][2]. MANETs enable nodes to communicate with each other without any existing infrastructure or centralized administration. These rapidly deployable and self-configuring networks have applications in many critical areas, such as battlefields, disaster relief and wide-area sensing and surveillance. Routing in ad hoc networks has been an active research field in recent years [1][2]. However, most of the existing work focuses on connected networks, where an end-to-end path exists between any two nodes in the network. In sparse networks, where partitions are not exceptional events, these routing algorithms fail to deliver packets because no route can be found to the destinations. Message ferrying [3][4][5][6] is one solution to this problem. Message ferrying is a networking paradigm in which a special node, called a message ferry, provides connectivity in a mobile ad hoc network whose nodes are sparsely deployed. One of the key challenges under this paradigm is the design of ferry routes that achieve certain end-to-end connectivity properties, such as delay and message loss, among the nodes in the ad hoc network. The ferry route must be kept short to obtain a small message delivery delay, and the ferry on the designed route needs to visit as many nodes as possible to reduce message loss. To satisfy the above requirements, in this paper we propose a cluster-based route design. In our proposal, the nodes in the network are clustered based on the node locations and the radio range of the ferry node. The ferry moves along the path formed by the centers of the clusters. Our simulation results show good performance in terms of improved message delivery ratio and delay.
2 Message Ferrying

In Message Ferrying [3], the devices in the network are classified into two categories. (i) Regular nodes, or simply the nodes, that move according to some mobility model. These nodes generate data for other nodes in the network in the form of application layer data units called messages. At the same time, these nodes are interested in receiving the messages that other nodes have generated for them. For this work we assume that all the messages are unicast, i.e., they have a single unique destination. (ii) A single special node called the message ferry (MF) that is responsible for delivering the messages between the nodes. The ferry achieves this by traversing a predetermined path repeatedly.

Fig. 1. Message Ferrying approach
Both the ferry and the nodes are equipped with a similar radio of a given short or long communication range. The nodes and the ferry can communicate with each other only when the distance between them is less than the communication range; the node and the ferry are in contact when they are within the communication range of each other. During each successful contact, the ferry exchanges messages with the node: it uploads the messages that the node has generated for other nodes, and downloads the messages that it holds for that particular node. This process is referred to as service. The ferry services only one node at a time. Fig. 1 is an illustration of this message ferrying approach. Since both the ferry and the nodes are mobile in our model, we make the simplifying assumption that when the ferry and a node come into contact with each other, either the contact lasts long enough to complete the service, or they can pause and exchange messages; usually the service time is short and does not amount to a disruption of node mobility. After the exchange, the ferry continues along its route and the node continues its movement. We assume that the ferry has infinite resources to move around
and meet the other nodes, to communicate with the other nodes when they come within its radio range, and to carry the messages between the nodes. Furthermore, we assume that we can route the message ferry in whatever way we want within the region where the nodes move. In this work, the only constraints we consider regarding the ferry are that it cannot move faster than a certain maximum speed and that it can only communicate with nodes within a given radio range. In [3], two message ferrying approaches, the Node-Initiated Message Ferrying (NIMF) scheme and the Ferry-Initiated Message Ferrying (FIMF) scheme, are proposed to utilize the mobility of either the ferry or the regular nodes. In the NIMF scheme, the ferry moves according to a specific route. The ferry route is known to the nodes, e.g., periodically broadcast by the ferry or conveyed by other out-of-band means, and the nodes move proactively and periodically to meet up with the ferry, which stays on its known route. When a sending node approaches the ferry, it forwards its messages to the ferry, and the receiving node later meets the ferry and receives its messages. By using the ferry as a relay, the sending node can send messages to the receiving node even if there is no end-to-end path between them. However, the movements of nodes are disrupted, because the nodes must move proactively to contact the ferry.

Fig. 2. Ferry route of FIMF scheme
In the FIMF scheme, the ferry moves proactively to meet up with nodes for communication. In this scheme, nodes are equipped with a long-range radio which is used for transmitting control messages. Initially, the ferry follows a specific default route and periodically broadcasts its location to the nodes using a long-range radio. When a node finds that the ferry is nearby and wants to send or receive messages via the ferry, it sends a request message, which includes the node's location information, to the ferry using its long-range radio. Upon reception of a request message, the ferry adjusts its trajectory to meet the node. To guide the ferry movement, the node occasionally transmits messages to notify the ferry of its new location. When the ferry and the
node are close enough, they exchange messages via short-range radios. After completing the message exchange with the node, the ferry moves back to its default route. In this scheme, the movement of nodes is not disrupted. Fig. 2 shows the ferry route in the FIMF scheme. In the FIMF scheme, as mentioned above, the ferry tours the default route repeatedly. The delay performance therefore depends on the length of the default route, regardless of the node locations and the demands from the nodes. To improve this performance, the default route should be changed dynamically according to the node locations and the nodes' demands.
3 Adaptive Message Ferry Route Design Using Clustering

3.1 Key Idea

We produce ferry routes in the form of an ordered set of way-points. These way-points have to be chosen carefully to achieve the desired message delivery performance. The performance of message ferrying depends on the route that the ferry traverses, because the moving ferry delivers the messages. To improve the performance, it is necessary to make the ferry route as short as possible. At the same time, the nodes and the ferry must meet frequently. In the previous work, the ferry visits each node individually, so the number of ferry movements needed to visit all nodes equals the number of nodes. In this case, the ferry route may become unnecessarily long. To avoid this degradation, the number of way-points that form the ferry route has to be minimized. When a number of nodes exist within the radio range of the ferry at a certain way-point, the ferry can contact many nodes in a single movement to that way-point. Such a way-point should be selected as a candidate way-point. We expect that the ferry route can be minimized as a result.

Fig. 3. Ferry route of proposed scheme
In this paper, we cluster the nodes based on their locations and select the center of each cluster as a way-point. Here, the radius of a cluster is the communication range between the nodes and the ferry. Fig. 3 shows a new default ferry route established by our approach.

3.2 Ferry Route Design Using Clustering

The ferry route design process is as follows.

1. The ferry moves on the path that is the predefined default route.
2. The ferry obtains the node locations periodically through long-range radio communication.
3. When the ferry arrives at the original point, it constructs the path to be traversed according to the node locations and sets this path as the default route.
4. The ferry repeats the above process.

We divide the ferry route design algorithm (Step 3 above) into two parts: finding a good set of way-points, and ordering these way-points to form a route.

3.2.1 Adequate Way-Points

To find a good set of way-points, we use complete-linkage (hierarchical) clustering to cluster all nodes according to their locations; Euclidean distance is used as the distance measure between clusters. Our clustering algorithm is as follows.

1. Start by assigning each node to its own cluster. Let the distances between the clusters be the same as the distances between the nodes they contain.
2. Find the closest pair of clusters and merge them into a single cluster.
3. Compute the distances between the new cluster and each of the old clusters. The distance between two clusters is the farthest distance from any member of one cluster to any member of the other cluster.
4. Repeat steps 2 and 3 until the distance between any two clusters becomes longer than the radio range.
5. Find the farthest pair of nodes in each cluster. The middle point between this pair of nodes is the way-point.

3.2.2 Construction of a Path from Way-Points

After the ferry has found the way-points, it orders them so as to minimize the length of the route. This is the travelling salesman problem (TSP), which is NP-hard. In this paper, we use the nearest neighbor algorithm as an approximation algorithm to solve this problem.
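The two parts of the route construction can be sketched as follows. This is a minimal illustration of the complete-linkage clustering with the radio-range stopping criterion and the nearest-neighbour ordering described above; the 2-D point representation, function names and starting position are our own assumptions, not the authors' implementation.

```python
import math
from itertools import combinations

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def complete_linkage_clusters(nodes, radio_range):
    """Merge the closest clusters until every remaining inter-cluster
    (complete-linkage) distance exceeds the radio range."""
    clusters = [[n] for n in nodes]
    while len(clusters) > 1:
        best = None
        for (i, a), (j, b) in combinations(enumerate(clusters), 2):
            d = max(dist(p, q) for p in a for q in b)  # complete linkage
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best
        if d > radio_range:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def way_point(cluster):
    """Midpoint of the farthest pair of nodes in the cluster."""
    if len(cluster) == 1:
        return cluster[0]
    a, b = max(combinations(cluster, 2), key=lambda pair: dist(*pair))
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)

def ferry_route(nodes, radio_range, start=(0.0, 0.0)):
    """Nearest-neighbour ordering of the way-points (approximate TSP)."""
    points = [way_point(c) for c in complete_linkage_clusters(nodes, radio_range)]
    route, current = [], start
    while points:
        nxt = min(points, key=lambda p: dist(current, p))
        route.append(nxt)
        points.remove(nxt)
        current = nxt
    return route
```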
4 Performance Evaluation In this section, we evaluate the performance of our method by network simulator (NS2). We use both data delivery and message delay metrics to evaluate our performance. The message delivery rate is defined as the ratio of the number of successfully delivered messages to the total number of messages generated. The message delay is defined as the average delay between the time a message is generated and the time the
message is received at the destination. We compare our method with the previously proposed FIMF scheme. We first describe our simulation setup and then present our simulation results.

4.1 Simulation Setup

We chose a setup with one ferry node and 40 nodes that move in a 5 km x 5 km area according to one of the following mobility models.

− Random Way-point (RWP): Each node moves according to the random way-point mobility model in the entire 5 km x 5 km region. The RWP parameters are the same for all nodes: v = 1–5 m/s, and an exponentially distributed pause time with a mean of 50 sec.
− Area-Based RWP (AB-RWP): 30 nodes move according to the random way-point mobility model in a 400 m x 400 m region whose center is uniformly distributed in the 5 km x 5 km area. The other 10 nodes move in the entire 5 km x 5 km region. This model reflects locality in node mobility.

We assume that 25 nodes are randomly chosen as sources which send messages to randomly chosen destinations every 20 seconds. We also assume that the long communication range for contact discovery and the short communication range for message exchange between the ferry and nodes are 1 km and 250 m, respectively. Finally, we assume that there is a single ferry in the system and that the ferry speed is 15 m/s. The default ferry route follows a rectangle with (1250, 1250) and (3750, 3750) as diagonal points. Each message has a timeout value which specifies when the message should be dropped if not delivered. The timeout value of messages was varied from 1000 sec to 8000 sec in our simulations.

4.2 Simulation Results

Figure 4 shows the message delivery rate under different message timeout values. Our proposal outperforms FIMF in both mobility models for all timeout values. As the message timeout value increases, the message delivery rate for FIMF also increases but remains lower than that of the proposal. For example, when the timeout value is 8000 sec, our proposal achieves more than a 70% delivery rate, and in the RWP model it achieves more than a 90% delivery rate, while FIMF delivers 60% of messages. This is because in our proposal the ferry route is minimized and the ferry can meet more nodes within the timeout value, which leads to a higher message delivery rate. Figure 5 shows the message delay under different message timeout values. In the RWP model, the proposal and FIMF show similar performance. This is because in the RWP model the nodes are distributed uniformly over the entire area, so even our proposal cannot shorten the ferry route compared with the default route of FIMF. In contrast, in the AB-RWP model our proposal achieves significantly lower delay than FIMF. This is because our method can minimize the ferry route, which leads to better delay performance; in the AB-RWP model, some locality can be seen in the node mobility.
Fig. 4. Message delivery rate under different message timeout values
Fig. 5. Message delay under different message timeout values
These results show that our method can minimize the ferry route and performs well. In particular, when there is locality in the node mobility, our proposal shows significantly better performance.
5 Conclusions

In this paper, we proposed an adaptive route design method based on clustering which dynamically establishes the ferry route according to the node locations. Our method clusters all the nodes so that each cluster, whose maximum radius is the radio communication range between the ferry and the nodes, includes as many nodes as possible. The ferry route consists of the set of cluster centers. It can be expected that as the number of clusters decreases, the length of the ferry route is also reduced. Our simulation results show that our method performs significantly better than the existing ferry routing scheme, which uses a predefined route. Our method needs online collaboration between the nodes and the ferry; however, it can work flexibly even in environments where the node mobility changes dynamically. We leave the cooperation technique between multiple message ferries for future research.
References

1. Johnson, D., Maltz, D.: Dynamic source routing in ad-hoc wireless networks. In: Proc. of ACM SIGCOMM (August 1996)
2. Perkins, C., Bhagwat, P.: Highly dynamic destination-sequenced distance-vector routing (DSDV) for mobile computers. Computer Communication Review 24 (October 1994)
3. Zhao, W., Ammar, M., Zegura, E.: A Message Ferrying Approach for Data Delivery in Sparse Mobile Ad Hoc Networks. In: Proc. of ACM Int'l Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc 2004), pp. 187–198 (May 2004)
4. Li, Q., Rus, D.: Sending Messages to Mobile Users in Disconnected Ad Hoc Wireless Networks. In: Proc. of ACM Int'l Conf. on Mobile Computing and Networking (MobiCom 2000), pp. 44–55 (August 2000)
5. Zhao, W., Ammar, M., Zegura, E.: Controlling the Mobility of Multiple Data Transport Ferries in a Delay-Tolerant Network. In: Proc. of IEEE Infocom 2005, vol. 2, pp. 1407–1418 (March 2005)
6. Zhao, W., Ammar, M., Zegura, E.: Proactive Routing in Highly-partitioned Wireless and Ad hoc Networks. In: Proc. of 9th IEEE Workshop on Future Trends in Distributed Computing Systems, FTDCS (May 2003)
Affordance in Dynamic Objects Based on Face Recognition

Taizo Miyachi1, Toshiki Maezawa2, Takanao Nishihara1, and Takeshi Suzuki3

1 School of Information Science and Technology, Tokai University, 1117 Kitakaname, Hiratsuka, Kanagawa 259-1292, Japan
2 Toppan Printing Co., Ltd.
3 Kanagawa Prefecture Hiratsuka School for the Visually Impaired
[email protected]
Abstract. Moving objects and oncoming pedestrians sometimes become obstacles for a pedestrian on a crosswalk or on the pavement at night. Low vision persons and aged people with low eyesight sometimes have to dodge other pedestrians or dangers. A dynamic object should suggest its affordance to a person for safe and pleasant mobility, just as a static object like a chair affords sitting. We propose an affordance of dynamic objects based on instinct and an affordance indicator system for safe mobility based on face recognition. The results of an examination of the indicator both in a passage and on stairs are also discussed.
1 Introduction

Sustainable transport such as walking and public transport can save energy and help create a vibrant city that generates healthy profits and reduces medical expenses. The numbers of persons with low eyesight and low vision persons have been increasing in the aging society. The number of low vision persons in the world is about 269 million [1]. Safe mobility for them [7] is indispensable for a vibrant compact city. Persons with low eyesight also walk in crowded crosswalks, in crowded passages in stations, and on dark pavements at night in order to reach their destinations. It has become hard for them to cope with such situations, since many pedestrians use mobile phones while walking on a crosswalk and do not actively avoid collisions with other pedestrians. Persons with low eyesight can acquire static spatial information about the route to a destination from a website or a mobile terminal like UC [2]. However, a rapid stream of oncoming pedestrians forms on the crosswalk or in a station passage just after the arrival of a train. Low vision persons and aged persons carrying bags cannot acquire dynamic information about such a rapid stream and often fall into the tough situation of colliding with oncoming pedestrians in the stream. Ultrasonic detectors never stop alerting because there are too many obstacles [8, 9]. We propose an affordance of dynamic objects and an indicator system for the safe and pleasant mobility of persons with low eyesight based on face recognition. An examination of the indicator both in a passage and on stairs is also discussed.
2 Dynamic Object and Aggregate Motion

A pedestrian can build an image of the route to a destination based on spatial information acquired from maps and websites. The pedestrian can usually deal with trouble caused by static objects when he/she encounters unexpected situations involving static objects or a delay of transport. However, trouble with moving objects is more difficult to handle than trouble with static objects. Pedestrians have been injured in trouble with moving objects such as revolving doors and escalators.

■ The aggregate motion of a herd of pedestrians and danger

Nowadays people can send e-mails and enjoy many web services on mobile phones anywhere. They can easily use mobile phones even while crossing a crosswalk, and they do not actively pay attention to avoiding collisions with oncoming pedestrians there. This is a new social phenomenon, and it makes it harder for low vision persons and aged persons with bags to cross a crosswalk safely and without anxiety. Pedestrians used to avoid facing oncoming pedestrians on a crosswalk. A mobile phone user on a crosswalk now simply follows the herd of pedestrians moving in the same direction, based on the three simple disciplines of the aggregate motion “boid”, which models an instinct of animals and was proposed by Craig Reynolds [6] (a minimal sketch of these steering rules is given at the end of this section):

(1) Separation: steer to avoid crowding local flockmates.
(2) Alignment: steer towards the average heading of local flockmates.
(3) Cohesion: steer to move toward the average position of local flockmates.

Therefore, persons with low eyesight and aged persons also need an affordance [3, 4] from a dynamic object like a stream of a crowd in order to avoid danger. There are three similarly dangerous situations in mobility:

(a) a stream of crowded people in a passage near ticket wickets,
(b) a pedestrian on the pavement at night who is talking with someone on a mobile phone in order to defend herself against an attacker, and
(c) a facing oncoming bicycle on a road or on the pavement.
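The sketch below illustrates the three Reynolds steering rules quoted above for a single simulated pedestrian; the weights, data layout and function name are illustrative assumptions of ours, not part of the cited boid model.

```python
import math

def steer(self_pos, self_vel, neighbors, sep_dist=1.0):
    """One step of Reynolds' three flocking rules for a single agent.

    neighbors: list of (position, velocity) tuples of nearby flockmates.
    Returns a new velocity combining separation, alignment and cohesion.
    """
    if not neighbors:
        return self_vel

    sep = [0.0, 0.0]      # (1) separation: steer away from crowded flockmates
    avg_vel = [0.0, 0.0]  # (2) alignment: steer towards the average heading
    center = [0.0, 0.0]   # (3) cohesion: steer towards the average position

    for pos, vel in neighbors:
        dx, dy = self_pos[0] - pos[0], self_pos[1] - pos[1]
        d = math.hypot(dx, dy)
        if 0 < d < sep_dist:
            sep[0] += dx / d
            sep[1] += dy / d
        avg_vel[0] += vel[0]
        avg_vel[1] += vel[1]
        center[0] += pos[0]
        center[1] += pos[1]

    n = len(neighbors)
    align = [avg_vel[0] / n - self_vel[0], avg_vel[1] / n - self_vel[1]]
    cohere = [center[0] / n - self_pos[0], center[1] / n - self_pos[1]]

    return (self_vel[0] + 1.5 * sep[0] + 0.5 * align[0] + 0.1 * cohere[0],
            self_vel[1] + 1.5 * sep[1] + 0.5 * align[1] + 0.1 * cohere[1])
```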
3 Affordance of a Dynamic Object

Donald Norman appropriated the term affordance in the context of human-machine interaction to refer to just those "action possibilities" that are readily perceivable by an actor [4]. This makes the concept dependent not only on the physical capabilities of an actor, but also on the actor's goals, plans, values, beliefs, and past experiences. A metal plate instead of a door knob suggests to the user that the plate should be pushed. A cliff suggests to animals moving on land that it is dangerous. In the same way, a stream of oncoming crowd pedestrians on a wide crosswalk should suggest that it is dangerous, like the cliff. Affordance in dynamic objects is very important, since it is not easy for a person to escape from the situations described in Section 2 once he/she has fallen into them. The characteristics of the affordance of dynamic objects can be classified into five points:
(r1) A dynamic object appears suddenly.
(r2) Damage from a dynamic object is generally greater than damage from a static object.
(r3) It is harder to avoid damage from a dynamic object than from a static object, since a moving obstacle cannot change its current motion at once because of inertia [10].
(r4) The possibility that a dynamic object creates a new situation in the environment is greater than for a static object.
(r5) A relationship is created between a part of the dynamic object and the user.
■ Affordance from a dynamic object
The affordances of a dynamic object in mobility are (a1) to (a5), corresponding to (r1) to (r5):
(a1) Awareness of a dynamic object.
(a2) Suggestion of the salient relation between the dynamic object, the human and the surroundings in a new space in the environment, of changes in that relation, and of danger.
(a3) Finding the right moment.
(a4) Acting (avoid, change position, ride on, wait, etc.).
(a5) Suggestion of adjustments for a safe action in a safe spot.

Concerning (a2): just as the affordance of a static object such as a pen, a pen case, or a cream puff suggests a different form of the hand for holding it, the affordance of a dynamic object also suggests different forms of the hand. In addition, the affordance of a dynamic object suggests a relation between the dynamic object, the human, and the environment. Examples of such three-element relations are {an electric toothbrush, a hand, a mouth}, {an escalator, a human, floors}, and {a butterfly, a net, a leaf}. In the case of a stream of oncoming crowd pedestrians, the relation is {a stream of oncoming crowd pedestrians, a pedestrian with low eyesight, the surrounding space}. The relation between each individual oncoming pedestrian and the pedestrian with low eyesight also becomes very important, since a stream of oncoming crowd pedestrians is a complex object that consists of many oncoming pedestrians and performs a complex motion.

Affordance (a5) is very important for keeping safe, since the user moves in order to perform (a4) and should also avoid the other dangers embedded in the environment. Examples of adjustments for riding an escalator or entering a revolving door are "move a little", "do not dodge", "do not get caught under a step", "take care at the bottom of the next stairs", and "do not get sandwiched between the door and the wall".
4 An Affordance Indicator System for Dynamic Objects Based on Face Recognition

We propose an affordance indicator system for aged people and low-vision persons, based on face recognition, that lets them use the affordance of a dynamic object with their remaining eyesight. With it they can walk even in a crowded passage or in a dark passage without colliding with other pedestrians, and can enjoy their mobility. A face recognition function is very effective for recognizing oncoming pedestrians, since a pedestrian's face turns in the direction he/she is going; a pedestrian walking at a quick pace in particular has to keep watching that direction. Such a dynamic object should therefore suggest danger through its affordance.
■ Features of the affordance indicator system
A prototype of the affordance indicator system consists mainly of three parts, carried with a helmet and a rucksack:
(a) USB camera (GR-CAM130N2). Night mode: infrared camera with 8 LED lights, 30 frames/sec, 640 × 480 pixels.
(b) Monocular HMD (DataGlass2/A, Shimazu) mounted on a helmet.
(c) Note computer (VAIO type S, SONY) in a rucksack.
- Software: the system is programmed in the C language. Basic software: Microsoft Visual Studio 2005. API: OpenCV. OS: Windows XP. (A minimal capture-initialisation sketch follows.)
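As a rough illustration of how the capture side of such a prototype might be initialised with the OpenCV C API of that period, the fragment below opens a USB camera and requests the resolution and frame rate listed above. The property requests use standard OpenCV 1.x calls, but whether a particular camera honours them is driver-dependent; this is a sketch under those assumptions, not the authors' code.

#include <highgui.h>   /* OpenCV 1.x C API: capture and display */

/* Open the first USB camera and request 640x480 at 30 frames/sec,
   mirroring the hardware specification above. Returns NULL on failure. */
CvCapture *open_camera(void)
{
    CvCapture *cap = cvCaptureFromCAM(0);
    if (cap == NULL)
        return NULL;
    cvSetCaptureProperty(cap, CV_CAP_PROP_FRAME_WIDTH,  640);
    cvSetCaptureProperty(cap, CV_CAP_PROP_FRAME_HEIGHT, 480);
    cvSetCaptureProperty(cap, CV_CAP_PROP_FPS,          30);
    return cap;
}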
■ Functions of the affordance indicator system
Users can judge the degree of danger from colored circles used like a traffic signal: a person with low eyesight can see the circles in the display even when he/she cannot watch the oncoming pedestrians themselves under lighting conditions that are bad for him/her. For example, a low vision person who suffers from glare and cannot make out anything in the real world can still watch the display of the system. The user can not only watch the colored circles but also hear a sound sign corresponding to the color of the nearest circle, even in such a hard environment. The user can enjoy safe and pleasant mobility [5] to the destination, since he/she knows in advance the oncoming pedestrians, their number and density in the distribution of circles over the surrounding space, and their speed, from the change of the colored circles in the HMD display in real time. Thanks to the infrared camera function, the user acquires the same affordance from oncoming pedestrians in a dark place as in a well-lighted room. Moreover, the user acquires a similar affordance and remains aware of the oncoming pedestrians when moving from a well-lighted room into a dark passage, since the camera automatically switches from "day mode" to "night mode". The affordance indicator system has five major functions.
Fig. 1. Colored circles for affordance from pedestrians (4 m and 6 m distance thresholds)
Fig. 2. An image of a crowded passage
(1) Face recognition in a well-lighted room or passage in a building.
(2) Face recognition in a dark place.
(3) Automatic switching between day mode and night mode ((1) and (2)).
(4) Drawing a large colored circle on each recognized face in the HMD display, in a traffic-signal color, in real time. The system not only shows the circle but also makes a sound sign according to the distance between the oncoming pedestrian and the user. The colors are red (within 4 m), yellow (between 4 m and 6 m) and green (more than 6 m). (A code sketch follows this list.)
(5) Recognition succeeds for a face with glasses or with a mask.
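To make functions (1)-(5) concrete, the sketch below shows one way the per-frame processing could look with the OpenCV C API of that period: a Haar-cascade detector finds facing pedestrians, the distance is roughly estimated from the apparent face width with a pinhole-camera approximation, and a circle is drawn in the traffic-signal colour defined in (4). The paper does not state which detector or calibration it uses, so the cascade, the constants FOCAL_PX and FACE_WIDTH_M, and the distance formula are illustrative assumptions, not the authors' implementation.

#include <cv.h>
#include <highgui.h>

/* Illustrative calibration constants (assumptions, not from the paper). */
#define FOCAL_PX     600.0   /* camera focal length expressed in pixels */
#define FACE_WIDTH_M 0.16    /* assumed physical face width in metres   */

/* Traffic-signal colours from function (4): red < 4 m, yellow 4-6 m, green > 6 m. */
static CvScalar colour_for_distance(double d)
{
    if (d < 4.0) return CV_RGB(255, 0, 0);      /* red: within 4 m           */
    if (d < 6.0) return CV_RGB(255, 255, 0);    /* yellow: between 4 and 6 m */
    return CV_RGB(0, 255, 0);                   /* green: more than 6 m      */
}

/* Detect facing pedestrians in one frame and overlay a coloured circle on each face. */
void annotate_faces(IplImage *frame,
                    CvHaarClassifierCascade *cascade,   /* loaded once with cvLoad() */
                    CvMemStorage *storage)
{
    CvSeq *faces;
    int i;

    cvClearMemStorage(storage);
    /* Haar-cascade face detection stands in for the paper's unspecified detector. */
    faces = cvHaarDetectObjects(frame, cascade, storage,
                                1.1, 3, CV_HAAR_DO_CANNY_PRUNING, cvSize(30, 30));
    for (i = 0; i < (faces ? faces->total : 0); i++) {
        CvRect *r = (CvRect *)cvGetSeqElem(faces, i);
        /* Pinhole approximation: a face appears wider in pixels the closer it is. */
        double dist_m = FOCAL_PX * FACE_WIDTH_M / (double)r->width;
        CvPoint centre = cvPoint(r->x + r->width / 2, r->y + r->height / 2);
        cvCircle(frame, centre, r->width, colour_for_distance(dist_m), 4, 8, 0);
        /* The sound sign keyed to the nearest circle's colour would be triggered here. */
    }
}

A surrounding loop would grab frames with cvQueryFrame(), call annotate_faces(), and display the annotated image on the HMD with cvShowImage().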
5 Experiments

Test1. Face recognition in nine different situations
We prepared nine different situations in order to investigate the influence of glasses, a mask on the nose, hands over the eyes, and the lighting conditions. We also measured the brightness in the room.
① Recognize a facing person.
② Recognize five facing persons simultaneously.
③ Recognize a facing person while the user carrying the camera is stamping his legs at the same position.
④ Recognize a facing oncoming person who walks close to the user.
⑤ Recognize the person under the conditions of ③ and ④ combined.
⑥ Recognize a person wearing glasses.
⑦ Recognize a person who covers one eye with his hand.
⑧ Recognize a person who covers both eyes with his hands.
⑨ Recognize a person who covers his nose with a mask.
Table 1. Face recognition in 9 situations

Environment (lx)                                   ①  ②  ③  ④  ⑤  ⑥  ⑦  ⑧  ⑨
Opened blinds (531.3)                              ○  ○  ○  ○  ○  ○  △  ×  ○
Closed blinds, fluorescent light (488.6)           ○  ○  ○  ○  ○  ○  △  ×  ○
Closed blinds, fluorescent light (148.5)           ○  ○  ○  ○  ○  ○  △  ×  ○
Half-closed blinds (31.6)                          ○  ○  ○  ○  ○  ○  △  ×  ○
Closed blinds (4.3)                                ●  ●  ●  ●  ●  ●  ▲  ×  ●

○, ●: success; △, ▲: partial success; ×: fail; ●, ▲: recognized by the infrared (night) mode.
Results
(1) Face recognition succeeded except for ⑦ and ⑧, the situations in which the eyes are covered by hands.
(2) Face recognition succeeded within 8 meters.
- Easy awareness of the dynamic object through the colored circles ("○") and easy recognition of the distribution of the "○"s.
- The strength of the stream is shown by the number and distribution of the "○"s.
- The density of the "○"s shows how easily the damage can be avoided.
- An adequate timing for acting can be read from the wave of "○"s.
- Suitable parts of the stream to relate to can be identified from the distribution of "○"s.
- The colored circles can be seen even in a dark space.
Test2. Avoiding collision with a facing oncoming pedestrian in a passage
The system shows a circle whose color depends on the distance between the user and an oncoming pedestrian, and makes a sound sign corresponding to that color.
- Subjects: six sighted men and one sighted woman (in their 20s). Subjects wore a helmet and watched the monocular HMD.
- Place. Test2-1: a well-lighted passage beside M Lab. Test2-2: a dark passage beside M Lab.
- Procedure of Test2. A facing person walks towards the user from 30 meters away. We check whether the user has enough time to avoid the collision with the oncoming person.

Results
Test2-1. All subjects could easily avoid the facing oncoming pedestrian, about 3 m ahead of the pedestrian, in the well-lighted passage. We confirmed that the sound signs corresponding to the circle colors were also useful for knowing about the oncoming pedestrians.
Test2-2. All subjects managed to avoid the facing oncoming pedestrian. It was hard for a user to find a safe spot beside him/her, since it was too dark to find such a spot and to ensure its safety. The users were worried about possible obstacles beside them, felt a kind of fear in the dark space, and lost a few seconds. One subject stopped for a few seconds when he became aware of the oncoming pedestrian in the display.

■ Discussion
Test2-1. All subjects avoided collision with the pedestrian approaching from the front by using the affordance indicator system, since they had found both the oncoming pedestrian and a safe space beside them in advance. We confirmed that users can avoid such collisions if they have previously found a safe space based on the awareness given by the colored circles. We suppose that, if there were no safe space beside him/her, a pedestrian could not help moving either towards a direction shown by green circles or to the side end of the space; that might be a suitable behavior. He/she should move to the position behind a facing pedestrian who has stopped in order to avoid him/her, since the next facing pedestrian following that one usually changes direction because of the two obstacles, leaving a small safe space there. The third facing pedestrian likewise cannot help changing direction and follows the second one. This corresponds to a "boid" with a collision-avoidance function. We also confirmed the usefulness of the system for two oncoming persons and one big cart in the well-lighted passage beside M Lab; it would be better if the system also showed how large the cart beside an oncoming person was.

Interview and discussion, Test2-2. Question for a low vision person: "Would you avoid an oncoming pedestrian if you could not notice any obstacles beside you because of the darkness on the pavement?" Answer: "Yes, I would avoid it, because a moving object is the most dangerous thing: it appears suddenly just before the collision. I can deal with collisions with static obstacles, and the damage from such collisions might not be so big. Awareness through the colored circles is very useful for me." This confirmed the usefulness of the affordance indicator system for a low vision person.

Test3. Avoiding collision with a facing oncoming pedestrian on stairs
- The procedure of Test3 is the same as that of Test2 except for the place.
- Place. Test3-1: well-lighted stairs between 3F and 4F.
Test3-2: with a wide-angle lens, on the well-lighted stairs between 3F and 4F. An additional condition used the camera tilted up towards 4F on the same stairs.

Results
Test3-1. No subject could avoid the oncoming pedestrian, because the subjects watched their next step and no video of the face of the oncoming pedestrian could be captured in advance.

Table 2. Eyesight of subjects

Subject   Left   Right
A         0.7    0.7
B         0.8    0.8
C         0.05   0.3
D         0.8    0.5
E         0.5    0.7
F         0.5    0.1
G         0.5    0.7
H         1.0    1.0
Fig. 3. Images of avoidance of collisions in Test2 and Test3: (a) well-lighted passage, (b) dark passage, (c) stairs
Interview. The camera could not face the direction from which the oncoming pedestrian was walking down the stairs, because all pedestrians watched their next step for safe stepping.
Test3-2. No subject could avoid the oncoming pedestrian even with a lens twice as wide, because they could not simultaneously keep both their next step and the face of the oncoming pedestrian within the camera's view.
- The camera would need a lens more than twice as wide. We suppose that this approach would not be very good, since such a camera would need very high resolution.
■ Discussion. In an additional test, one pedestrian could avoid the oncoming pedestrian on the stairs when he tilted the camera up towards the direction from which the oncoming pedestrian was stepping down. This means that users should tilt the camera up towards the oncoming pedestrian, but this method is very troublesome. We suppose that a system with two cameras might be effective on stairs.
6 Conclusions

We discussed the importance of the affordance of dynamic objects, which often produce danger, based on the rules of the "boid" instinct and on physical inertia, with the aim of enabling safe and pleasant mobility by sustainable transport. We proposed the affordance of a dynamic object in mobility and an affordance indicator system based on face recognition, and built a prototype system for both well-lighted and dark spaces using a USB infrared camera. We also examined the system in both well-lighted spaces
and dark spaces and confirmed that colored circles drawn on the faces of oncoming persons were useful for persons with low eyesight: they let the users know the oncoming pedestrians and their distribution in advance and avoid them. The system also conveyed the context of the dynamic change of a stream of oncoming pedestrians through the colored circles drawn for the crowd. The system might not be useful against direct glare or in other difficult lighting situations, since the USB camera is very simple and has no intelligent adjustment functions for such situations. Combining an obstacle recognition function with face recognition might improve the affordance indicator system.
References
1. World Health Organization: Visual impairment and blindness (2010), http://www.who.int/mediacentre/factsheets/fs282/en/
2. MLIT National Institute for Land and Infrastructure Management: A draft of Technology for FMA system (2009), http://www.nilim.go.jp/lab/bcg/jiritsu/index.htm
3. Gibson, J.J.: The Theory of Affordances. In: Shaw, R., Bransford, J. (eds.) Perceiving, Acting, and Knowing (1977), ISBN 0-470-99014-7
4. Norman, D.: The Design of Everyday Things, ISBN 0-465-06710-7. Originally published under the title The Psychology of Everyday Things, often abbreviated to POET (1988)
5. Lynch, K.: The Image of the City. MIT Press, Cambridge (1960)
6. Reynolds, C.: Flocks, herds and schools: A distributed behavioral model. In: SIGGRAPH 1987: Proceedings of the 14th Annual Conference on Computer Graphics and Interactive Techniques, pp. 25–34. Association for Computing Machinery, New York (1987), doi:10.1145/37401.37406
7. Miyachi, T., Balvig, J.J., Kisada, W., et al.: A Quiet Navigation for Safe Crosswalk by Ultrasonic Beams. In: Apolloni, B., Howlett, R.J., Jain, L. (eds.) KES 2007, Part III. LNCS (LNAI), vol. 4694, pp. 1049–1057. Springer, Heidelberg (2007)
8. UltraCane.com: Whois Record (2010), http://whois.domaintools.com/ultracane.com
9. Shin, B.-S., Lim, C.-S.: Obstacle Detection and Avoidance System for Visually Impaired People. LNCS. Springer, Heidelberg (2007)
10. Miyachi, T., Balvig, J.J.: An Expansion of Space Affordance by Sound Beams and Tactile Indicators. In: Gabrys, B., Howlett, R.J., Jain, L.C. (eds.) KES 2006. LNCS (LNAI), vol. 4252, pp. 603–610. Springer, Heidelberg (2006)
Author Index
Abe, Akinori III-307 Abecker, Andreas I-660 Abdul Maulud, Khairul Nizam IV-22 Abe, Hidenao III-297 Abe, Jair Minoro III-123, III-133, III-143, III-154, III-164, III-200 Abu Bakar, Azuraliza IV-22 Adachi, Yoshinori III-63, III-81 Adam, Giorgos III-389 Aguilera, Felipe II-591 Ahmad Basri, Noor Ezlin IV-22 Ain, Qurat-ul I-340 Akama, Seiki III-133, III-143, III-164, III-200 Albusac, J. IV-347 Alechina, Natasha IV-41 Alonso-Betanzos, Amparo I-168 Alosefer, Yaser IV-556 ´ Alvarez, H´ector II-581 Alvarez, H´ector II-591 Alvez, Carlos E. II-44 Alvi, Atif IV-576 An, Dongchan II-302 Aoki, Kumiko II-143 Aoki, Masato IV-153 Apostolakis, Ioannis III-23 Appice, Annalisa III-339 Aritsugi, Masayoshi IV-220 Asano, Yu I-649 Ashida, Masaya II-611 Asimakis, Konstantinos III-389 Aspragathos, Nikos II-341 Ayres, Gareth IV-566 Azpeitia, Eneko II-495 Azuma, Haruka III-273 Baba, Norio III-555 Baba, Takahiro III-207 Babenyshev, S. II-224 Babenyshev, Sergey I-230 Baig, Abdul Rauf I-61 Bajo, Javier IV-318 Baralis, Elena III-418
Bardone, Emanuele III-331 Barry, Dana M. IV-200 Bath, Peter II-163 Baumgartner Jr., William A. IV-420 Belohlavek, Radim I-471 Belˇsa, Igor II-21 Ben Hassine, Mohamed Ali I-532 Bermejo-Alonso, Julita I-522 Bhatti, Asim I-5 Biernacki, Pawel I-350, I-360 Biscarri, F´elix I-410 Biscarri, Jes´ us I-410 Bishop, Christopher I-3 Blaˇskovi´c, Bruno II-292 Bluemke, Ilona II-82 Boochs, Frank I-576 Borzemski, Leszek II-505 Bouamama, Sadok II-312 Bouch´e, P. IV-32 Bouras, Christos III-379, III-389 Bourgoin, Steve IV-410 Bratosin, Carmen I-41 Bravo-Marquez, Felipe II-93 Bretonnel Cohen, K. IV-420 Bruno, Giulia III-418 Buckingham, Christopher D. IV-88 Bukatoviˇc, Martin I-432 Bumbaru, Severin I-188 Byrne, Caroline IV-365 Cambria, Erik IV-385 Carpio Valadez, Juan Mart´ın II-183 Carrella, Stefano II-361 Caspar, Joachim IV-402 Cavallaro, Alexander I-290 Chakkour, Fairouz IV-586 Champesme, Marc II-351 Chang, Jae-Woo I-511 Charton, Eric IV-410 Chiusano, Silvia III-418 Chowdhury, Nihad Karim I-511 Ciampi, Anna III-339 Cˆırlugea, Mihaela IV-613 Ciruela, Sergio IV-70
Clive, Barry IV-633 Cocea, Mihaela II-103, II-124 Condell, Joan IV-430 Cooper, Jerry IV-497 Corchado, Emilio S. IV-318 Corchado, Juan M. IV-318 Coyne, Bob IV-375 Cruz, Christophe I-576 Csipkes, Doris IV-603 Csipkes, Gabor IV-603 Cui, Hong IV-506 Culham, Alastair IV-517 ˇ Cupi´ c, Marko I-100 Cuzzocrea, Alfredo III-426 Czarnecki, Adam II-533 d’Amato, Claudia III-359 Katarzyna III-369 Dabrowska-Kubik, Dalbelo Baˇsi´c, Bojana I-100, II-31 Da Silva Filho, Jo˜ ao In´ acio III-154 de Faria, Ricardo Coelho III-174 Debenham, John I-220 Deb Nath, Rudra Pratap I-511 Delgado, Miguel IV-70, IV-337 ˇ Dembitz, Sandor II-292 Dengel, Andreas I-290 Dentsoras, Argyris II-331 De Paz, Juan F. IV-318 Detyniecki, Marcin I-544 Di Bitonto, Pierpaolo II-64 Diniz, Jana´ına A.S. III-182 Dino Matijaˇs, Vatroslav I-100 Dolog, Peter III-398 Domenici, Virna C. III-418 Do˜ na, J.M. III-445 do Prado, H´ercules Antonio III-174 Duarte, Abraham II-183 Eckl, Chris IV-385 Ercan, Tuncay II-253, II-263 Ernst, Patrick I-290 Ezawa, Hiroyasu IV-280 Fadzli, Syed Abdullah IV-240 Fahlman, Scott E. II-193 Fanizzi, Nicola III-359 Farago, Paul IV-623 Farquad, M.A.H. I-461 Feng, Haifeng I-544 Fern´ andez-Breis, Jesualdo Tom´ as
I-597
Fernandez-Canque, Hernando IV-603 Fern´ andez de Alba, Jos´e M. IV-328 Ferneda, Edilson III-174, III-182 Ferro, Alfredo III-438 Festila, Lelia IV-623 Festilˇ a, Lelia IV-613 Figueiredo, Adelaide III-182 Flann, Christina IV-497 ´ Fontenla-Romero, Oscar I-168 Forge, David II-351 Fortino, Giancarlo I-240 Fraire Huacuja, H´ector Joaqu´ın II-183 Fujii, Satoru III-483, III-519 Fujiki, Takatoshi III-509 Fujiwara, Minoru IV-163 Fukuda, Taro III-473 Fukui, Shinji III-89 Fukumi, Minoru III-612 Fukumura, Yoshimi II-143, IV-190, IV-200 Fulcher, John II-454 Furutani, Michiko III-307 Furutani, Yoshiyuki III-307 G.V.R., Kiran II-11 Ga´ al, Bal´ azs I-607 Gabbar, Hossam A. II-427 Gagnon, Michel IV-410 Gartiser, N. IV-32 Gharahbagh, Abdorreza Alavi I-331 Ghofrani, Sedigheh I-331 Gibbins, Nicholas IV-594 Giddy, Jonathan IV-485 Giugno, Rosalba III-438 Glaser, Hugh IV-594 Gledec, Gordan II-292 Glez-Morcillo, C. IV-347 Goesele, Michael IV-402 Golemanova, Emilia II-253, II-263 Golemanov, Tzanko II-253, II-263 Gomez, Juan Carlos I-566 G´ omez-Ruiz, J. III-445 G´ omez Zuluaga, Giovanni II-601 Gonda, Yuuji IV-210 Gonzaga Martins, Helga III-154 Gonzalez B., Juan J. II-203 Gonz´ alez, Yanira I-51 G¨ org, Carsten IV-420 Gotoda, Naka II-620, IV-145 Graczyk, Magdalena I-111
Author Index Gra˜ na, M. IV-80 Grauer, Manfred II-399 Greaves, David IV-576 Grillo, Nicola III-426 Grimnes, Gunnar I-290 Grosvenor, Roger II-371 Grzech, Adam II-523 Guerrero, Juan I. I-410 Guerrero, Luis A. II-93, II-591 Gurevych, Iryna IV-402 Gutierrez-Santos, Sergio II-124 Hacid, Mohand-Said III-426 Hagita, Norihiro III-307 H˚ akansson, Anne II-273, IV-60, IV-98, IV-124 Halabi, Ammar IV-527 Haller, Heiko I-660 Hamaguchi, Takashi II-381, II-417 Hamdan, Abdul Razak I-491 Hanabusa, Hisatomo IV-308 Handa, Hisashi III-555 Hangos, Katalin M. II-389 Hanser, Eva IV-430 Harada, Kouji III-637 Hardisty, Alex IV-485 Hartung, Ronald IV-124 Hartung, Ronald L. II-273 Hasegawa, Mikio IV-271 Hasegawa, Naoki IV-190 Hashimoto, Yoshihiro II-417 Hashizume, Aoi II-135 Haskkour, Nadia IV-586 Hattori, Akira IV-290 Havasi, Catherine IV-385 Hayami, Haruo IV-290 Hayashi, Hidehiko IV-475 Hayashi, Yuki IV-153 Heap, Marshall J. IV-517 Hern´ andez, Carlos I-522 Hintea, Sorin IV-603, IV-613, IV-623 Hiratsuka, Yoshimune III-315 Hirokawa, Masakazu I-148 Hirokawa, Sachio III-207 Hirschberg, Julia IV-375 ˇ Hocenski, Zeljko I-300 H¨ oppner, F. I-442 Hor´ ak, Aleˇs I-432 Horiguchi, Ryota IV-308 Hosseini, Mohammad Mehdi I-331
Huang, Houkuan II-1 Hunger, A. II-114 Hunter, Lawrence E. IV-420 Hussain, Amir IV-385 Iftikhar, Nadeem III-349 Igarashi, Masao III-622 Iijima, Chie III-264 Iijima, Morihisa IV-308 Ikeda, Mitsuru IV-163 Inoue, Akiya III-225 Inoue, Etsuko III-509 Inuzuka, Nobuhiro III-72 Iribe, Yurie II-143, IV-173 Ishida, Keisuke IV-190 Ishida, Yoshiteru III-628, III-637, III-645, III-652 Ishii, Naohiro III-97, III-104, III-113 Islim, Ahmed-Derar IV-527 Isokawa, Teijiro III-592 Iswandy, Kuncup II-361 Itoh, Toshiaki II-381 Itokawa, Tsuyoshi IV-220 Ito, Momoyo III-612 Itou, Junko III-473, III-527 Ivan, Lavall´ee I-452 Iwahori, Yuji III-63, III-81, III-89 Iwashita, Motoi III-225 Iwazaki, Tomonori IV-190 Jabeen, Hajira I-61 Jaffar, M. Arfan I-340 Jain, Lakhmi C. II-454 Jakobovi´c, Domagoj I-100 Jamil, Hasan III-408 Jantan, Hamidah I-491 Jascanu, Nicolae I-188 Jascanu, Veronica I-188 Jezic, Gordan I-261 Jia, Dawei I-5 Jiao, Roger I-131 Jimbo, Takashi III-97 Jimenez, L. IV-347 Jimenez-Molina, Angel II-54 Jing, Liping II-1 Jin, Zhe III-464 Ji, Xiaofei I-369 Jones, Andrew C. IV-485
Kambayashi, Yasushi I-198 Kamide, Norihiro I-178, II-153 Kanda, Taki II-477 Kanematsu, Hideyuki IV-200 Karadgi, Sachin II-399 Karmacharya, Ashish I-576 Kasabov, Nikola I-1 Kastania, Anastasia N. III-43, III-53 Kasugai, Kunio III-81 Katarzyniak, Radoslaw I-271 Katoh, Takashi III-455 Kavakli, Manolya II-214 Kawaguchi, Masashi III-97 Kawakatsu, Hidefumi III-281 Kawano, Hiromichi III-225 Kelly, Michael IV-11, IV-135 Khalid, Marzuki I-69, II-464 Kholod, Marina III-273 Kim, Daewoong IV-261 Kim, Hakin II-302 Kimura, Makito III-264 Kimura, Naoki II-381, II-409 Kipsang Choge, Hillary III-612 Kitasuka, Teruaki IV-220 Klawonn, Frank I-141, II-244 Ko, In-Young II-54 Kodama, Issei IV-475 Koffa, Antigoni III-53 Kogawa, Keisuke III-555 Kohlert, Christian I-321 Kohlert, Michael I-321 Kojima, Masanori III-572 Kojiri, Tomoko IV-153 Kolp, Manuel I-209 Komine, Noriyuki III-545, III-572 K¨ onig, Andreas I-321, II-361 Korouˇsi´c Seljak, Barbara I-587 Kou, Tekishi II-409 Kouno, Shouji III-225 Kountchev, Roumen III-133, III-215 Kozierkiewicz-Hetma´ nska, Adrianna I-281 Kozmann, Gy¨ orgy I-607 Kratchanov, Kostadin II-253, II-263 Kriˇsto, Ivan II-21 Kubo, Masao IV-298 Kumakawa, Toshiro III-315 Kunifuji, Susumu IV-457 Kunimune, Hisayoshi IV-210 Kurahashi, Wataru III-89
Kurdi, Mohamed-Zakaria IV-527 Kurosawa, Takeshi III-225 Kusaka, Mariko III-555 Kuwahara, Daiki IV-465 Lambert-Torres, Germano III-154 Lasota, Tadeusz I-111 Laterza, Maria II-64 Lawrenz, Wolfhard II-244 L awrynowicz, Agnieszka III-359 Le, D.-L. II-114 Lee, Huey-Ming II-438 Lee, Hyun-Jo I-511 Lensch, Hendrik P.A. IV-402 Le´ on, Carlos I-410 Le´ on, Coromoto I-51 Lesot, Marie-Jeanne I-544 Lewandowski, Andrzej I-311 L’Huillier, Gaston II-93, II-581 Li, Chunping I-131 Li, Kai IV-173 Li, Yibo I-369 Li, You II-445 Lin, Lily II-438 Liu, Honghai I-369 Liu, Jin I-379 Liu, Jing II-214 Liu, Kecheng I-554 Liu, Lucing III-207 Liu, Xiaofan IV-41 Liu, Yang I-90 Liu, Ying I-131 Logan, Brian IV-41 Lopes, Helder F.S. III-164 L´ opez, Juan C. II-193 Loukakos, Panagiotis I-481 Lovrek, Ignac I-251 Ludwig, Simone A. IV-536 Lukose, Dickson I-627 Lunney, Tom IV-430 Luz, Saturnino IV-394 Ma, Minhua IV-430 Mac´ıa, I. IV-80 Mackin, Kenneth J. III-622 Maddouri, Mondher I-121 Maeda, Kaoru I-639 Maehara, Chihiro IV-261 Maeno, Hiroshi IV-163 Maezawa, Toshiki II-645
Author Index Magnani, Lorenzo III-331 Magoulas, George D. II-103, II-124 Mahanti, Ambuj II-282 Maheswaran, Ravi II-163 Mahoto, Naeem A. III-418 Majumder, Sandipan II-282 M´ ak, Erzs´ebet I-607 Makino, Toshiyuki III-72 Malachowski, Bartlomiej IV-180 Malerba, Donato III-339 Mancilla-Amaya, Leonardo II-553 Marc, Bui I-452 Mar´ın, Nicol´ as IV-70 Markos, Panagiotis II-331 Marteau, Pierre-Fran¸cois I-420 Mart´ınez-Romero, Marcos II-74 Mart´ınez-Costa, Catalina I-597 Mart´ınez F., Jos´e A. II-173, II-203 Marzani, Franck I-576 Masoodian, Masood IV-394 Mat Ali, Nazmona I-554 Mati´c, Tomislav I-300 Matsubara, Takashi IV-298 Matsuda, Noriyuki II-637 Matsui, Nobuyuki III-592 Matsumoto, Hideyuki II-417 Matsuoka, Rumiko III-307 Matsushita, Kotaro III-622 Matsuura, Kenji II-620, IV-145 Mattila, Jorma K. IV-108 Maus, Heiko I-639 Mc Kevitt, Paul IV-430 McMeekin, Scott G. IV-633 Meddouri, Nida I-121 Mehmood, Irfan I-340 Mehmood, Rashid IV-566, IV-576 Mello, Bernardo A. III-182 Men´ arguez-Tortosa, Marcos I-597 Merlo, Eduardo II-581, II-591 Metz, Daniel II-399 Mill´ an, Roc´ıo I-410 Millard, Ian C. IV-594 Minaduki, Akinori IV-475 Mi˜ narro-Gim´enez, Jos´e Antonio I-597 Mineno, Hiroshi II-135, III-535 Miranda, Gara I-51 Mishina, Takashi III-493 Misue, Kazuo IV-440 Mitsuishi, Takashi III-281 Miura, Hajime IV-190
Miura, Hirokazu II-637 Miura, Motoki IV-457, IV-465 Miyachi, Taizo II-645 Miyaji, Isao III-483 Miyoshi, Masato III-612 Mizuno, Tadanori III-572 Mizuno, Shinji II-143 Mizuno, Tadanori II-135, III-535 Mizutani, Masashi I-198 Moens, Marie-Francine I-566 Mohd Yatid, Moonyati Binti III-473 Molina, Jos´e Manuel IV-357 M¨ oller, Manuel I-290 Molnar, Goran I-100 Monedero, I˜ nigo I-410 Moradian, Esmiralda IV-98, IV-124 Morihiro, Koichiro III-592 Morii, Fujiki I-390 Morita, Hiroki III-572 Moulianitis, Vassilis II-341 Moya, Francisco II-193 Muhammad Fuad, Muhammad Marwan I-420 Muhammad-Sukki, Firdaus IV-633 Mukai, Naoto IV-280 M¨ uller, Ulf II-399 Munemori, Jun III-473, III-527 Munteanu, Cristian R. II-74 Murakami, Akira III-315 Murat, Ahat I-452 Musa, Zalili II-454 Nabi, Zubair IV-576 Nahavandi, Saeid I-5 Nakada, Toyohisa IV-449 Nakagawa, Masaru III-509 Nakahara, Takanobu III-244, III-273 Nakamatsu, Kazumi III-123, III-133, III-143, III-164, III-200, III-215 Namiki, Junji III-562 Naqi, Syed M. I-340 Naruse, Keitaro IV-298 Nasri, Chaker Abidi I-532 Nauck, Detlef I-141 Naveen, Nekuri I-80 N´emeth, Erzs´ebet II-389 N´emeth, Istv´ ann´e I-607 Nguyen, A.-T. II-114 Nguyen, D.-T. II-114
Nguyen, Ngoc Thanh I-281 Niimura, Masaaki IV-210 Nishi, Daisuke II-637 Nishide, Tadashi III-473 Nishihara, Takanao II-645 Nishihara, Yoko III-315 Nishimura, Haruhiko III-592 Nishino, Kazunori II-143 Niskanen, Vesa A. IV-116 Niwa, Takahito III-113 Noda, Masaru II-381 Noguchi, Daijiro IV-163 Nonaka, Yuki IV-271 Nunohiro, Eiji III-622 Obembe, Olufunmilayo IV-88 Oberm¨ oller, Nils II-244 Oehlmann, Ruediger III-290 O’Grady, Michael J. IV-365 O’Hare, Gregory M.P. IV-365 Ohsawa, Yukio III-315 Oikawa, Ryotaro I-198 Okada, Masashi III-104 Okada, Yoshihiro IV-251 Okada, Yousuke III-113 Okajima, Seiji IV-251 Okamoto, Takeshi III-628 Okumura, Noriyuki IV-51 Oliver, Jos´e L. I-31 Oltean, Gabriel IV-623 Omitola, Tope IV-594 Omori, Yuichi IV-271 Onn, Kow Weng I-627 Onogi, Manabu III-113 Ooshaksaraie, Leila IV-22 Orlewicz, Agnieszka II-82 Orlowski, Aleksander II-515 Orlowski, Cezary II-533, II-543, II-571 Othman, Zulaiha Ali I-491 Otsuka, Shinji II-620 Ounelli, Habib I-532 Oyama, Tadahiro III-612 Ozaki, Masahiro III-63 Ozell, Benoit IV-410 Palenzuela, Carlos II-495 Paloc, C. IV-80 Park, Jong Geol III-622 Park, Seog II-302
Pav´ on, Juan IV-328 Pazos, Alejandro II-74 Pazos R., Rodolfo II-183 Pazos R., Rodolfo A. II-173, II-203 Pedersen, Torben Bach III-349 Pel´ aez, J.I. III-445 P´erez O., Joaqu´ın II-173 Pereira, Javier II-74 Peter, S. I-442 Petre, Emil II-234 Petric, Ana I-261 Petrigni, Caterina III-418 Pham, Tuan D. I-379 Pint´er, Bal´ azs I-607 Podobnik, Vedran I-251 Porto-D´ıaz, Iago I-168 Poulopoulos, Vassilis III-389 Pratim Sanyal, Partha IV-506 Prickett, Paul II-371 Pudi, Vikram II-11 Pu, Fei IV-135 Puga Soberanes, H´ector Jos´e II-183 Puglisi, Piera Laura III-438 Pulvirenti, Alfredo III-438 Raghavendra Rao, C. I-80 Raja, Hardik IV-485 Raju, S. Bapi I-461 Rambousek, Adam I-432 Rambow, Owen IV-375 Ramirez-Iniguez, Roberto IV-633 Rana, Omer F. IV-546, IV-556 Rango, Francesco I-240 Ravi, V. I-80, I-461 Ray, Sanjog II-282 Read, Simon II-163 Reicher, Tomislav II-21 Renaud, D. IV-32 Resta, Marina III-583 Richards, Kevin IV-497 R´ıos, Sebasti´ an A. II-93, II-581, II-591 Rodr´ıguez, Manuel I-522 Rodr´ıguez, Sara IV-318 Rogers, Bill IV-394 Rojas P., Juan C. II-203 Rojtberg, Pavel IV-402 Roos, Stefanie IV-536 Ros, Mar´ıa IV-337 Roselli, Teresa II-64
Author Index Rossano, Veronica II-64 Rostanin, Oleg I-639 Rousselot, F. IV-32 Rouveyrol, Claire III-81 R´ oz˙ ewski, Przemyslaw IV-180 Ruhlmann, Laurent IV-410 Russo, Wilma I-240 Rybakov, Vladimir I-230, II-224, III-323 Rygielski, Piotr II-523 Sadanandan, Arun Anand I-627 Sadek, Jawad IV-586 Saha, Sourav II-282 Said, Fouchal I-452 Sakamoto, Ryuuki III-501 Salem, Ziad IV-586 San, Tay Cheng I-69 S´ anchez-Pi, Nayat IV-357 San´ın, Cesar II-553, II-601 Sanin, Cesar II-563 Santaolaya S., Ren´e II-203 Santofimia, Mar´ıa J. II-193 Sanz, Ricardo I-522 Sasaki, Takuya III-455 Sato, Hiroshi IV-298 Sato, Hitomi IV-290 Sawamoto, Jun III-455 Sawaragi, Tetsuo I-2 Sch¨ afer, Walter II-399 Schwarz, Katharina IV-402 Segawa, Norihisa III-455 Segura, Carlos I-51 Seifert, Sascha I-290 Seli¸steanu, Dan II-234 S ¸ endrescu, Dorin II-234 Seta, Kazuhisa IV-163 Setchi, Rossitza I-481, I-617, IV-240 Shadbolt, Nigel IV-594 Shankar, Ravi II-11 Shi, Lei I-617 Shida, Haruki III-628 Shidama, Yasunari III-281 Shima, Takahiro III-519 Shimoda, Toshifumi II-143 Shimogawa, Shinsuke III-225 Shiraishi, Yoh III-493 Siddiqui, Raees II-371 Sidirokastriti, Sofia III-43 Sidorova, Natalia I-41
Sierra, Carles I-220 ˇ c, Artur II-21, II-31 Sili´ Sintek, Michael I-290 Sitek, Tomasz II-571 Skorupa, Grzegorz I-271 Sofiane, Benamor I-452 Sohn, So Young IV-200 Soldano, Henry II-351 Sproat, Richard IV-375 Srivastava, Muni S. III-7 Stasko, John IV-420 Stewart, Brian G. IV-633 Suchacka, Gra˙zyna II-505 Sugihara, Taro IV-457 Sugino, Eiji III-455 Sugiyama, Takeshi III-15 Sun, Fan I-90 Sunayama, Wataru III-235 Suzuki, Kenji I-148 Suzuki, Nobuo III-1 Suzuki, Takeshi II-645 Suzuki, Takeshi I-639 Suzuki, Yu IV-440 Szczerbicki, Edward II-515, II-553, II-563, II-601 Szlachetko, Boguslaw I-311 Szolga, Lorant Andras IV-613 Taguchi, Ryosuke IV-200 Takahashi, Hirotaka IV-190 Takahashi, Megumi III-519 Takahashi, Osamu III-493 Takai, Keiji III-254 Takano, Shigeru IV-251 Takeda, Kazuhiro II-381, II-417 Takeda, Kosuke III-501 Takeda, Masaki III-555 Takeshima, Ryo IV-230 Taki, Hirokazu II-611, II-637 Takimoto, Munehiro I-198 Tamura, Yukihiro III-235 Tanabe, Kei-ichi III-645 Tanaka, Jiro IV-440 Tanaka, Takushi III-190 Tanaka, Toshio II-620 Tanaka-Yamawaki, Mieko III-602 Tanaka, Yuzuru I-14, I-649 Telec, Zbigniew I-111 Tenorio, E. III-445
Tipney, Hannah IV-420 Tokuda, Mitsuhiro IV-465 Tominaga, Yuuki IV-210 Tomiyama, Yuuki III-235 Torii, Ippei III-104, III-113 Toro, Carlos II-495 Torres, Claudio Rodrigo III-154 Tortosa, Leandro I-31 T´ oth, Attila II-389 Tran, V.-H. II-114 Trawi´ nski, Bogdan I-111 Tschumitschew, Katharina I-141, II-244 Tsogkas, Vassilis III-379 Tsuchiya, Seiji I-400, IV-1 Tsuda, Kazuhiko III-1 Tsuge, Satoru III-612 Tsuge, Yoshifumi II-409 Tsumoto, Shusaku III-297 Uemura, Yuki IV-220 Ueta, Tetsushi IV-145 Uno, Takeaki III-244 Ushiama, Taketoshi IV-261 Valencia-Garc´ıa, Rafael I-597 Vallejo, D. IV-347 Valsamos, Harry II-341 van der Aalst, Wil I-41 Vaquero, Javier II-495 Varlamis, Iraklis III-23, III-33 Vass´ anyi, Istv´ an I-607 V´ azquez A., Graciela II-173 V´ azquez-Naya, Jos´e M. II-74 Vecchietti, Aldo R. II-44 Vel´ asquez, Juan D. II-93, II-581 Ventos, V´eronique II-351 Verspoor, Karin IV-420 Vicent, Jos´e F. I-31 Vila, Amparo IV-337 Villanueva, David Ter´ an II-183 Vychodil, Vilem I-471 Wada, Yuji III-455 Wakayama, Yuki IV-251 Walters, Simon II-322 Wan, Chang I-501 Wan, Jie IV-365 Wang, Bo II-445 Watabe, Hirokazu I-400, IV-1
Watada, Junzo II-445, II-454, II-485 Watanabe, Takashi III-123 Watanabe, Toyohide IV-153, IV-230 Watanabe, Yuji III-660 Watanabe, Yuta IV-475 Wautelet, Yves I-209 Whitaker, Roger I-4 White, Richard J. IV-485 Willett, Peter II-163 Wilton, Aaron IV-497 Woodham, Robert J. III-81, III-89 Wu, Dan IV-60, IV-124 Xu, Guandong
III-398
Yaakob, Shamshul Bahar II-485 Yada, Katsutoshi III-244, III-254, III-273 Yamada, Kunihiro III-483, III-535, III-545, III-562, III-572 Yamada, Takayoshi I-158 Yamaguchi, Takahira III-264 Yamaguchi, Takashi III-622 Yamamoto, Hidehiko I-158 Yamamura, Mariko III-1, III-7 Yamazaki, Atsuko K. II-630 Yamazaki, Makoto IV-190 Yanagihara, Hirokazu III-7 Yanagisawa, Yukio III-622 Yano, Yoneo II-620, IV-145 Yaoi, Takumu III-483 Yasue, Kizuki II-409 Yatsugi, Kotaro IV-261 Yonekura, Naohiro III-97 Yoshida, Koji III-519 Yoshida, Kouji III-483, III-572 Yoshihara, Yuriko III-555 Yoshihiro, Takuya III-509 Yoshimura, Eriko I-400, IV-1 Yu, Chunshui IV-506 Yu, Jian II-1 Yuizono, Takaya III-464 Yukawa, Takashi IV-190 Yun, Jiali II-1 Yunfei, Zeng III-63 Yunus, Mohd. Ridzuan II-464 Yusa, Naoki III-572 Yusof, Rubiyah I-69, II-464
Yusof, Yuhanis IV-546 Yuuki, Osamu III-535, III-562
Zamora, Antonio I-31 Zanni-Merk, C. IV-32 Zhang, Haoxi II-563
Zhang, Hui I-131 Zhang, Yan IV-11, IV-135 Zhang, Yanchun III-398 Zhou, Yi IV-135 Zi´ olkowski, Artur II-543 Zong, Yu III-398