Distributors for North, Central and South America: Kluwer Academic Publishers, 101 Philip Drive, Assinippi Park, Norwell, Massachusetts 02061 USA. Telephone (781) 871-6600, Fax (781) 871-6528, E-Mail
Distributors for all other countries: Kluwer Academic Publishers Group, Post Office Box 322, 3300 AH Dordrecht, THE NETHERLANDS. Telephone 31 78 6576 000, Fax 31 78 6576 474, E-Mail
Library of Congress Cataloging-in-Publication Data
Intelligent knowledge-based systems: business and technology in the new millennium / edited by Cornelius T. Leondes.
Includes bibliographical references and index.
Contents: v. 1. Knowledge-based systems; v. 2. Information technology; v. 3. Expert and agent systems; v. 4. Intelligent systems; v. 5. Neural networks, fuzzy theory and genetic algorithms.
ISBN 1-40207-746-7 (set); ISBN 1-40207-824-2 (v. 1); ISBN 1-40207-825-0 (v. 2); ISBN 1-40207-826-9 (v. 3); ISBN 1-40207-827-7 (v. 4); ISBN 1-40207-828-5 (v. 5); ISBN 1-40207-829-3 (electronic book set). (LOC information to follow.)
CONTENTS

Volume 1. Knowledge-Based Systems

1. Platform-Based Product Design and Development: Knowledge Support Strategy and Implementation 3
XUAN F. ZHA AND RAM D. SRIRAM

2. Knowledge Management Systems in Continuous Product Innovation 36
MARIANO CORSO, ANTONELLA MARTINI, LUISA PELLEGRINI, AND EMILIO PAOLUCCI

3. Knowledge-Based Measurement of Enterprise Agility 67
NIKOS C. TSOURVELOUDIS

4. Knowledge-Based Systems Technology in the Make or Buy Decision in Manufacturing Strategy 83
P. HUMPHREYS AND R. MCIVOR

5. Intelligent Internet Information Systems in Knowledge Acquisition: Techniques and Applications 110
SHIAN-HUA LIN

6. Aggregator: A Knowledge-Based Comparison Chart Builder for eShopping 140
F. KOKKORAS, N. BASSILIADES, AND I. VLAHAVAS

7. Impact of the Intelligent Agent Paradigm on Knowledge Management 164
JANIS GRUNDSPENKIS AND MARITE KIRIKOVA

8. Methods of Building Knowledge-Based Systems Applied in Software Project Management 207
CEZARY ORLOWSKI

9. Security Technologies to Guarantee Safe Business Processes in Smart Organizations 246
ISTVAN MEZGAR

10. Business Process Modelling and Its Applications in the Business Environment 288
BRANE KALPIC, PETER BERNUS, AND RALF MUHLBERGER

11. Knowledge-Based Systems Technology and Applications in Image Retrieval 346
EUGENIO DI SCIASCIO, FRANCESCO M. DONINI, AND MARINA MONGIELLO
FOREWORD
Almost unknown to the academic world, and to the general public, the application of intelligent knowledge-based systems is rapidly and effectively changing the future of the human species. Today, human well-being is, as it has been for all of history, fundamentally limited by the size of the world economic product. Thus, if human economic well-being (which I personally define as the bottom centile annual per capita income) is ever soon to reach an acceptable level (e.g., the equivalent of $20,000 per capita per annum in 2004), then intelligent knowledge-based systems must be employed in vast quantities. This is primarily because of the reality that few humans live in efficient societies (such as the United States, Canada, Japan, the UK, France, and Germany, for example) and that inefficient societies, many of which are already large, and growing larger, may require many decades to become efficient. In the meantime, billions of people will continue to suffer economic impoverishment, an impoverishment that inefficient human labor cannot remedy. To create the extra economic output so urgently needed, we have only one choice: to employ intelligent knowledge-based systems in great numbers, which will produce economic output prodigiously, but will consume hardly at all.

This multi-volume major reference work, architected by its editor, Cornelius T. Leondes, provides a wealth of 'case studies' illustrating the state of the art in intelligent knowledge-based systems. In contrast to ordinary academic pedagogy, where 'ivory tower' abstraction and elegance are the guiding principles, practical applications require detailed, relevant examples that can be used by practitioners to successfully innovate new operational capabilities. The economic progress of the species depends upon the flow of these innovations, which requires multi-volume major reference works with carefully selected, well-written, and well-edited 'case studies.' Professor Leondes knows these realities well, and the five volumes in this work resoundingly reflect his success in achieving their requirements.

Volume 1 addresses Knowledge-Based Systems. These eleven chapters consider the basic question of how accumulated data and staff expertise from business operations can be abstracted into valuable knowledge, and how such knowledge can then be applied to ongoing operations. Wide and representative situations are considered, ranging from product innovation and design, to intelligent database exploitation, to business model analysis. Volume 2, Information Technology, addresses in ten chapters the important question of how data should be stored and used to maximize its overall value. Case studies consider a wide variety of application arenas: product development, manufacturing, product management, and even product pricing. Volume 3 addresses Expert and Agent Systems in ten chapters. Application arenas considered include image databases, business process monitoring, e-commerce, and production planning and scheduling. Again, the coverage is designed to provide a wide range of perspectives and business-function concentrations to help stimulate innovation by the reader. Volume 4, Intelligent Systems, provides nine chapters considering such topics as mission-critical functions, business forecasting, medical patient care, and product design and development. Volume 5 addresses Neural Networks, Fuzzy Theory, and Genetic Algorithm Techniques. Its ten chapters cover examples in areas including bioinformatics, product lifecycle cost estimating, product development, computer-aided design, product assembly, and facility location.

The examples assembled by Professor Leondes in this work provide a wealth of practical ideas designed to trigger the development of innovation. The contributors to this grand project are to be congratulated for the major efforts they have expended in creating their chapters. Humans everywhere will soon benefit from the case studies provided herein. Intelligent Knowledge-Based Systems: Business and Technology in the New Millennium is a reference work that belongs on the desk of every innovative technologist. It has taken many decades of experience and unflagging hard work for Professor Leondes to accumulate the wisdom and judgment reflected in his editorial stewardship of this reference work. Wisdom and judgment are rare but indispensable commodities that cannot be obtained in any other way. The world of innovative technology, and the world at large, stand in his debt.

Robert Hecht-Nielsen
Computational Neurobiology
Institute for Neural Computation
Department of Electrical and Computer Engineering
University of California, San Diego
PREFACE
At the start of the 20th century, national economies on the international scene were, to a large extent, agriculturally based. This was, perhaps, the dominant reason for the protraction, on the international scene, of the Great Depression, which began with the Wall Street stock market crash of October 1929. After World War II the trend away from agriculturally based economies and toward industrially based economies continued and strengthened. Indeed, today, in the United States, only about 1% of the population is involved in the agriculture requirements of the US and, in addition, provides significant agriculture exports. This, of course, is made possible by the greatly improved techniques and technologies utilized in the agriculture industry. The trend toward industrially based economies after World War II was, in turn, followed by a trend toward service-based economies. In the United States today, over 70% of employment is in service industries, and this percentage continues to increase.

Separately, the electronic computer industry began to take hold in the early 1960s, and thereafter always seemed to exceed expectations. For example, the first large-scale sales of an electronic computer were of the IBM 650. At that time, projections were that the total sales for the United States would be twenty-five IBM 650 computers. Before the first one came off the production line, IBM had initial orders for over 30,000. That was thought to be huge by the standards of that day; today it is a very minuscule number, to say nothing of the fact that its computing power was also very minuscule by today's standards. Computer mainframes continued to grow in power and complexity. At the same time, Gordon Moore, of "Moore's Law" fame, and his colleagues founded INTEL. Then around 1980 MICROSOFT was founded, but it was not until the early 1990s, not that long ago, that WINDOWS was created, incidentally, after the APPLE computer family started. The first browser was the NETSCAPE browser, which appeared in 1995, also not that long ago. Of course, computer networking equipment, most notably CISCO's, also appeared about that time. Toward the end of the last century the "DOT COM bubble" occurred and "burst" around 2000.

Coming to the new millennium, for most of our history the wealth of a nation was limited by the size and stamina of the work force. Today, national wealth is measured in intellectual capital. Nations possessing skillful people in such diverse areas as science, medicine, business, and engineering produce innovations that drive the nation to a higher quality of life. To better utilize these valuable resources, intelligent, knowledge-based systems technology has evolved at a rapid and significantly expanding rate, and can be utilized by nations to improve their medical care, advance their engineering technology, and increase their manufacturing productivity, as well as play a significant role in a very wide variety of other areas of activity of substantive significance.

The breadth of the major application areas of intelligent, knowledge-based systems technology is very impressive. These include the following, among other areas: agriculture, business, chemistry, communications, computer systems, education, electronics, engineering, environment, geology, image processing, information, law, management, manufacturing, mathematics, medicine, meteorology, military, mining, power systems, science, space technology, and transportation.

It is difficult now to imagine an area that will not be touched by intelligent, knowledge-based systems technology. The great breadth and expanding significance of such a broad field on the international scene requires a multi-volume, major reference work to provide an adequately substantive treatment of the subject, "Intelligent Knowledge-Based Systems: Business and Technology in the New Millennium." This work consists of the following distinctly titled and well-integrated volumes:

Volume I. Knowledge-Based Systems
Volume II. Information Technology
Volume III. Expert and Agent Systems
Volume IV. Intelligent Systems
Volume V. Neural Networks

This five-volume set on intelligent knowledge-based systems clearly manifests the great significance of these key technologies for the new economies of the new millennium. The authors are all to be highly commended for their splendid contributions, which together will provide a significant and uniquely comprehensive reference source for research workers, practitioners, computer scientists, students, and others on the international scene for years to come.

Cornelius T. Leondes
University of California, Los Angeles
January 5, 2004
CONTRIBUTORS
VOLUME 1: KNOWLEDGE-BASED SYSTEMS

N. Bassiliades, Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, GREECE
Chapter 6. Aggregator: A Knowledge-Based Comparison Chart Builder for eShopping

Peter Bernus, School of CIT, Griffith University, Nathan, Queensland, AUSTRALIA
Chapter 10. Business Process Modeling and Its Applications in the Business Environment

Mariano Corso, Department of Management Engineering, Polytechnic University of Milan, Milano, ITALY
Chapter 2. Knowledge Management Systems in Continuous Product Innovation

Eugenio di Sciascio, Dipartimento di Elettrotecnica ed Elettronica, Politecnico di Bari, Bari, ITALY
Chapter 11. Knowledge-Based Systems Technology and Applications in Image Retrieval

Francesco M. Donini, Universita della Tuscia, Viterbo, ITALY
Chapter 11. Knowledge-Based Systems Technology and Applications in Image Retrieval

Janis Grundspenkis, Faculty of Computer Science and Information Technology, Riga Technical University, Riga, LATVIA
Chapter 7. Impact of the Intelligent Agent Paradigm on Knowledge Management

P. Humphreys, Faculty of Business and Management, University of Ulster, Northern Ireland, UNITED KINGDOM
Chapter 4. Knowledge-Based Systems Technology in the Make-or-Buy Decision in Manufacturing Strategy

Brane Kalpic, ETI Elektroelement Jt. St. Comp., Izlake, SLOVENIA
Chapter 10. Business Process Modeling and Its Applications in the Business Environment

Marite Kirikova, Faculty of Computer Science and Information Technology, Riga Technical University, Riga, LATVIA
Chapter 7. Impact of the Intelligent Agent Paradigm on Knowledge Management

F. Kokkoras, Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, GREECE
Chapter 6. Aggregator: A Knowledge-Based Comparison Chart Builder for eShopping

Shian-Hua Lin, Department of Computer Science and Information Engineering, National Chi Nan University, Taiwan, REPUBLIC OF CHINA
Chapter 5. Intelligent Internet Information Systems in Knowledge Acquisition: Techniques and Applications

Antonella Martini, Faculty of Engineering, University of Pisa, Pisa, ITALY
Chapter 2. Knowledge Management Systems in Continuous Product Innovation

R. McIvor, Faculty of Business and Management, University of Ulster, UNITED KINGDOM
Chapter 4. Knowledge-Based Systems Technology in the Make-or-Buy Decision in Manufacturing Strategy

Istvan Mezgar, CIM Research Laboratory, Computer and Automation Research Institute, Hungarian Academy of Sciences, Budapest, HUNGARY
Chapter 9. Security Technologies to Guarantee Safe Business Processes in Smart Organizations

Marina Mongiello, Dipartimento di Elettrotecnica ed Elettronica, Politecnico di Bari, Bari, ITALY
Chapter 11. Knowledge-Based Systems Technology and Applications in Image Retrieval

Ralf Muhlberger, Information Technology & Electrical Engineering, University of Queensland, Queensland, AUSTRALIA
Chapter 10. Business Process Modeling and Its Applications in the Business Environment

Cezary Orlowski, Gdansk University of Technology, Gdansk, POLAND
Chapter 8. Methods of Building Knowledge-Based Systems Applied in Software Project Management

Emilio Paolucci, Department of Operation and Business Management, Polytechnic University of Turin, Torino, ITALY
Chapter 2. Knowledge Management Systems in Continuous Product Innovation

Luisa Pellegrini, Faculty of Engineering, University of Pisa, Pisa, ITALY
Chapter 2. Knowledge Management Systems in Continuous Product Innovation

Ram D. Sriram, Design and Process Group, Manufacturing Systems Integration Division, National Institute of Standards and Technology, Gaithersburg, Maryland, USA
Chapter 1. Platform-Based Product Design and Development: Knowledge Support Strategy and Implementation

Nikos C. Tsourveloudis, Department of Production Engineering and Management, Technical University of Crete, Chania, Crete, GREECE
Chapter 3. Knowledge-Based Measurement of Enterprise Agility

I. Vlahavas, Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, GREECE
Chapter 6. Aggregator: A Knowledge-Based Comparison Chart Builder for eShopping

Xuan F. Zha, Design and Process Group, Manufacturing Systems Integration Division, National Institute of Standards and Technology, Gaithersburg, Maryland, USA
Chapter 1. Platform-Based Product Design and Development: Knowledge Support Strategy and Implementation

VOLUME 2: INFORMATION TECHNOLOGY

Ales Brezovar, Faculty of Mechanical Engineering, University of Ljubljana, Ljubljana, SLOVENIA
Chapter 4. Techniques and Analyses of Sequential and Concurrent Product Development Processes

Chris R. Chatwin, School of Engineering and Information Technology, University of Sussex, Brighton, UNITED KINGDOM
Chapter 3. Modeling Techniques in Integrated Operations and Information Systems in Manufacturing Systems

Ke-Zhang Chen, Department of Mechanical Engineering, The University of Hong Kong, HONG KONG
Chapter 5. Design and Modeling Methods for Components Made of Multi-Heterogeneous Materials in High-Tech Applications

Adrian E. Coronado, Management School, The University of Liverpool, Liverpool, UNITED KINGDOM
Chapter 2. Information Systems Frameworks and Their Applications in Manufacturing Systems
Xin-An Feng, School of Mechanical Engineering, Dalian University of Technology, Dalian, CHINA
Chapter 5. Design and Modeling Methods for Components Made of Multi-Heterogeneous Materials in High-Tech Applications

Janez Grum, Faculty of Mechanical Engineering, University of Ljubljana, Ljubljana, SLOVENIA
Chapter 4. Techniques and Analyses of Sequential and Concurrent Product Development Processes

George Hadjinicola, Department of Public and Business Administration, School of Economics and Management, University of Cyprus, Nicosia, CYPRUS
Chapter 9. Product Redesign and Pricing in Response to Competitor Entry: A Marketing-Production Perspective

Jared Jackson, IBM Almaden Research Center, San Jose, California, USA
Chapter 7. Web Data Extraction Techniques and Applications Using the Extensible Markup Language (XML)

D. F. Kehoe, Management School, The University of Liverpool, Liverpool, UNITED KINGDOM
Chapter 2. Information Systems Frameworks and Their Applications in Manufacturing Systems

Andreas Koeller, Department of Computer Science, Montclair State University, Upper Montclair, New Jersey, USA
Chapter 6. Quality and Cost of Data Warehouse Views

K. Ravi Kumar, Department of Information and Operations Management, Marshall School of Business, University of Southern California, Los Angeles, California, USA
Chapter 9. Product Redesign and Pricing in Response to Competitor Entry: A Marketing-Production Perspective

Janez Kusar, Faculty of Mechanical Engineering, University of Ljubljana, Ljubljana, SLOVENIA
Chapter 4. Techniques and Analyses of Sequential and Concurrent Product Development Processes

Henry C. W. Lau, Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University, Hunghom, HONG KONG
Chapter 10. Knowledge Discovery by Means of Intelligent Information Infrastructure Methods and Their Applications

Amy Lee, The Ohio State University, Columbus, Ohio, USA
Chapter 6. Quality and Cost of Data Warehouse Views

Choon Seong Leem, School of Computer and Industrial Engineering, Yonsei University, Seoul, KOREA
Chapter 1. Techniques in Integrated Development and Implementation of Enterprise Information Systems

A. C. Lyons, Management School, The University of Liverpool, Liverpool, UNITED KINGDOM
Chapter 2. Information Systems Frameworks and Their Applications in Manufacturing Systems

Jussi Myllymaki, IBM Almaden Research Center, San Jose, California, USA
Chapter 7. Web Data Extraction Techniques and Applications Using the Extensible Markup Language (XML)

Anisoara Nica, Sybase Incorporated, Waterloo, Ontario, CANADA
Chapter 6. Quality and Cost of Data Warehouse Views

Jorg Niemann, IFF, University of Stuttgart, and Fraunhofer IPA, Stuttgart, GERMANY
Chapter 8. Product Life Cycle Management in the Digital Age

Andrew Ning, Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University, Hunghom, HONG KONG
Chapter 10. Knowledge Discovery by Means of Intelligent Information Infrastructure Methods and Their Applications

Elke A. Rundensteiner, Department of Computer Science, Worcester Polytechnic Institute, Worcester, Massachusetts, USA
Chapter 6. Quality and Cost of Data Warehouse Views

Marko Starbek, Faculty of Mechanical Engineering, University of Ljubljana, Ljubljana, SLOVENIA
Chapter 4. Techniques and Analyses of Sequential and Concurrent Product Development Processes

Jong Wook Suh, School of Computer and Industrial Engineering, Yonsei University, Seoul, KOREA
Chapter 1. Techniques in Integrated Development and Implementation of Enterprise Information Systems

Qian Wang, School of Engineering and Information Technology, University of Sussex, Brighton, and Department of Mechanical Engineering, University of Bath, Bath, UNITED KINGDOM
Chapter 3. Modeling Techniques in Integrated Operations and Information Systems in Manufacturing Systems

Engelbert Westkamper, IFF, University of Stuttgart, and Fraunhofer IPA, Stuttgart, GERMANY
Chapter 8. Product Life Cycle Management in the Digital Age

Christina W. Y. Wong, Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University, Hunghom, HONG KONG
Chapter 10. Knowledge Discovery by Means of Intelligent Information Infrastructure Methods and Their Applications

R. C. D. Young, School of Engineering and Information Technology, University of Sussex, Brighton, UNITED KINGDOM
Chapter 3. Modeling Techniques in Integrated Operations and Information Systems in Manufacturing Systems

VOLUME 3: EXPERT AND AGENT SYSTEMS

Dimitris Askounis, Institute of Communications & Computer Systems, National Technical University of Athens, Athens, GREECE
Chapter 2. Expert Systems Technology in Production Planning and Scheduling

G. A. Britton, Design Research Center, School of Mechanical and Production Engineering, Nanyang Technological University, SINGAPORE
Chapter 1. Techniques in Knowledge-Based Expert Systems for the Design of Engineering Systems

Jing Dai, School of Computing, National University of Singapore, SINGAPORE
Chapter 9. Finding Patterns in Image Databases

Robert Gay, Institute of Communication and Information Systems, School of Electrical and Electronic Engineering, Nanyang Technological University, SINGAPORE
Chapter 6. Agent-Based eLearning Systems: A Goal-Based Approach

Angela Goh, School of Computer Engineering, Nanyang Technological University, SINGAPORE
Chapter 4. The Knowledge Base of a B2B eCommerce Multi-Agent System

Ivan Romero Hernandez, LCIS Research Laboratory, Technological University of Grenoble, Valence, FRANCE
Chapter 5. From Roles to Agents: Considerations on Formal Agent Modeling and Implementation

Tu Bao Ho, Japan Advanced Institute of Science and Technology, Ishikawa, JAPAN
Chapter 7. Combining Temporal Abstraction and Data-Mining Methods in Medical Data Analysis
Wynne Hsu, School of Computing, National University of Singapore, SINGAPORE
Chapter 9. Finding Patterns in Image Databases

Chun-Che Huang, Department of Information Management, National Chi Nan University, Taiwan, REPUBLIC OF CHINA
Chapter 3. Applying Intelligent Agent-Based Support Systems in Agile Business Processes

K. Karibasappa, Department of Electronics and Telecommunication Engineering, University College of Engineering, Burla, Sambalpur, Orissa, INDIA
Chapter 10. Cognition Techniques and Their Applications

Nelly Kasim, Singapore-MIT Alliance, National University of Singapore, SINGAPORE
Chapter 4. The Knowledge Base of a B2B eCommerce Multi-Agent System

Saori Kawasaki, Japan Advanced Institute of Science and Technology, Ishikawa, JAPAN
Chapter 7. Combining Temporal Abstraction and Data-Mining Methods in Medical Data Analysis

Jean-Luc Koning, LCIS Research Laboratory, Technological University of Grenoble, Valence, FRANCE
Chapter 5. From Roles to Agents: Considerations on Formal Agent Modeling and Implementation

Si Quang Le, Japan Advanced Institute of Science and Technology, Ishikawa, JAPAN
Chapter 7. Combining Temporal Abstraction and Data-Mining Methods in Medical Data Analysis

Mong Li Lee, School of Computing, National University of Singapore, SINGAPORE
Chapter 9. Finding Patterns in Image Databases

Antonio Liotta, Center for Communication Systems Research, University of Surrey, Guildford, Surrey, UNITED KINGDOM
Chapter 8. Distributed Monitoring: Methods, Means, and Technologies

Kostas Metaxiotis, Institute of Communications & Computer Systems, National Technical University of Athens, Athens, GREECE
Chapter 2. Expert Systems Technology in Production Planning and Scheduling

Chunyan Miao, School of Computer Engineering, Nanyang Technological University, SINGAPORE
Chapter 4. The Knowledge Base of a B2B eCommerce Multi-Agent System

Yuan Miao, Institute of Communication and Information Systems, Nanyang Technological University, SINGAPORE
Chapter 6. Agent-Based eLearning Systems: A Goal-Based Approach

Trong Dung Nguyen, Japan Advanced Institute of Science and Technology, Ishikawa, JAPAN
Chapter 7. Combining Temporal Abstraction and Data-Mining Methods in Medical Data Analysis

Srikanta Patnaik, Department of Electronics and Telecommunication Engineering, University College of Engineering, Burla, Sambalpur, Orissa, INDIA
Chapter 10. Cognition Techniques and Their Applications

John Psarras, Institute of Communications & Computer Systems, National Technical University of Athens, Athens, GREECE
Chapter 2. Expert Systems Technology in Production Planning and Scheduling

Zhiqi Shen, Institute of Communication and Information Systems, School of Electrical and Electronic Engineering, Nanyang Technological University, SINGAPORE
Chapter 6. Agent-Based eLearning Systems: A Goal-Based Approach

S. B. Tor, Singapore-MIT Alliance, Nanyang Technological University, SINGAPORE
Chapter 1. Techniques in Knowledge-Based Expert Systems for the Design of Engineering Systems

W. Y. Zhang, Design Research Center, School of Mechanical and Production Engineering, Nanyang Technological University, SINGAPORE
Chapter 1. Techniques in Knowledge-Based Expert Systems for the Design of Engineering Systems
VOLUME 4: INTELLIGENT SYSTEMS

Cheng-Leong Ang, Singapore Institute of Manufacturing Technology, SINGAPORE
Chapter 4. An Intelligent Hybrid System for Business Forecasting

Sistine A. Barretto, Advanced Computing Research Centre, The University of South Australia, Adelaide, AUSTRALIA
Chapter 6. Techniques in the Utilization of the Internet and Intranets in Facilitating the Development of Clinical Decision Support Systems in the Process of Patient Care

Billy Fenton, International Test Technologies and University of Ulster, Letterkenny, Donegal, IRELAND
Chapter 5. Intelligent Systems Technology in the Fault Diagnosis of Electronic Systems

Robert Gay, Institute of Communication and Information Systems, School of Electrical and Electronic Engineering, Nanyang Technological University, SINGAPORE
Chapter 4. An Intelligent Hybrid System for Business Forecasting

Victor Giurgiutiu, Mechanical Engineering Department, University of South Carolina, Columbia, South Carolina, USA
Chapter 8. Mechatronics and Smart Structures Design Techniques for Intelligent Products, Processes and Systems

Marc-Philippe Huget, Leibnitz Laboratory, Grenoble, FRANCE
Chapter 9. Engineering Interaction Protocols for Multiagent Systems

Richard W. Jones, School of Engineering, University of Northumbria, Newcastle upon Tyne, England, UNITED KINGDOM
Chapter 2. Intelligent Patient Monitoring in the Intensive Care Unit and the Operating Room

Jean-Luc Koning, LCIS Research Laboratory, Technological University of Grenoble, Valence, FRANCE
Chapter 9. Engineering Interaction Protocols for Multiagent Systems

Xiang Li, Singapore Institute of Manufacturing Technology, SINGAPORE
Chapter 4. An Intelligent Hybrid System for Business Forecasting

Liam Maguire, Department of Informatics, University of Ulster, Derry, NORTHERN IRELAND
Chapter 5. Intelligent Systems Technology in the Fault Diagnosis of Electronic Systems

T. M. McGinnity, Department of Informatics, University of Ulster, Derry, NORTHERN IRELAND
Chapter 5. Intelligent Systems Technology in the Fault Diagnosis of Electronic Systems

Tolety Siva Perraju, Verizon Communications, Waltham, Massachusetts, USA
Chapter 3. Mission Critical Intelligent Systems

Mauricio Sanchez-Silva, Department of Civil and Environmental Engineering, Universidad de los Andes, Bogota, COLOMBIA
Chapter 7. Risk Analysis and the Decision-Making Process in Engineering

Garimella Uma, South Asia International Institute, Hyderabad, INDIA
Chapter 3. Mission Critical Intelligent Systems

James R. Warren, Advanced Computing Research Centre, The University of South Australia, Mawson Lakes, AUSTRALIA
Chapter 6. Techniques in the Utilization of the Internet and Intranets in Facilitating the Development of Clinical Decision Support Systems in the Process of Patient Care

Xuan F. Zha, Design and Process Group, Manufacturing Systems Integration Division, National Institute of Standards and Technology, Gaithersburg, Maryland, USA
Chapter 1. Artificial Intelligence and Integrated Intelligent Systems: Applications in Product Design and Development
VOLUME 5: NEURAL NETWORKS, FUZZY THEORY AND GENETIC ALGORITHM TECHNIQUES

Kazem Abhary, School of Advanced Manufacturing and Mechanical Engineering, University of South Australia, Mawson Lakes, AUSTRALIA
Chapter 8. Assembly Sequence Optimization Using Genetic Algorithms

F. Admiraal-Behloul, Division of Image Processing, Leiden University Medical Center, Leiden, THE NETHERLANDS
Chapter 4. Fuzzy Rule Extraction Using Radial Basis Function Neural Networks in High-Dimensional Data

Kemal Ahmet, Faculty of Creative Arts and Technologies, University of Luton, Luton, UNITED KINGDOM
Chapter 1. Neural Network Systems Technology and Applications in CAD/CAM Integration

Carl K. Chang, Department of Computer Science, Iowa State University, Ames, Iowa, USA
Chapter 7. Genetic Algorithm Techniques and Applications in Management Systems

Lian Ding, Faculty of Creative Arts and Technologies, University of Luton, Luton, UNITED KINGDOM
Chapter 1. Neural Network Systems Technology and Applications in CAD/CAM Integration

Shing-Hwang Doong, Department of Information Management, Shu-Te University, Yen Chau, TAIWAN
Chapter 10. Computational Intelligence for Facility Location Allocation Problems

Yujia Ge, Department of Computer Science, Iowa State University, Ames, Iowa, USA
Chapter 7. Genetic Algorithm Techniques and Applications in Management Systems

Andrew Kusiak, Department of Mechanical and Industrial Engineering, University of Iowa, Iowa City, Iowa, USA
Chapter 5. Fuzzy Decision Modeling of Product Development Processes

Chih-Chin Lai, Department of Information Management, Shu-Te University, Yen Chau, TAIWAN
Chapter 10. Computational Intelligence for Facility Location Allocation Problems

Wen F. Lu, Product Design and Development Group, Singapore Institute of Manufacturing Technology, SINGAPORE
Chapter 6. Evaluation and Selection in Product Design for Mass Customization

Lee H. S. Luong, School of Advanced Manufacturing and Mechanical Engineering, University of South Australia, Mawson Lakes, AUSTRALIA
Chapter 8. Assembly Sequence Optimization Using Genetic Algorithms

Romeo Marin Marian, CSIRO Manufacturing & Infrastructure Technology, Woodville North, SA, AUSTRALIA
Chapter 8. Assembly Sequence Optimization Using Genetic Algorithms

Stergios Papadimitriou, Department of Information Management, Technological Education Institute of Kavala, Kavala, GREECE
Chapter 9. Kernel-Based Self-Organized Maps Trained with Supervised Bias for Gene Expression Data Mining

Johan H. C. Reiber, Division of Image Processing, Department of Radiology, Leiden University Medical Center, Leiden, THE NETHERLANDS
Chapter 4. Fuzzy Rule Extraction Using Radial Basis Function Neural Networks in High-Dimensional Data

Kwang-Kyu Seo, Division of Computer, Information and Telecommunication Engineering, Sangmyung University, Chungnam, KOREA
Chapter 2. Neural Network Systems Technology and Applications in Product Life-Cycle Cost Estimates

Joaquin Sitte, Faculty of Information Technology, Queensland University of Technology, Brisbane, AUSTRALIA
Chapter 3. Neural Network Systems Technology in the Analysis of Financial Time Series

Renate Sitte, Faculty of Engineering and Information Technology, Griffith University, Queensland, AUSTRALIA
Chapter 3. Neural Network Systems Technology in the Analysis of Financial Time Series

Ram D. Sriram, Design and Process Group, Manufacturing Systems Integration Division, National Institute of Standards and Technology, Gaithersburg, Maryland, USA
Chapter 6. Evaluation and Selection in Product Design for Mass Customization

Fu J. Wang, Design and Process Group, Manufacturing Systems Integration Division, National Institute of Standards and Technology, Gaithersburg, Maryland, USA
Chapter 6. Evaluation and Selection in Product Design for Mass Customization

Juite Wang, Department of Industrial Engineering, Feng Chia University, Taichung, Taiwan, REPUBLIC OF CHINA
Chapter 5. Fuzzy Decision Modeling of Product Development Processes

Chih-Hung Wu, Department of Information Management, Shu-Te University, Yen Chau, TAIWAN
Chapter 10. Computational Intelligence for Facility Location Allocation Problems

Yong Yue, Faculty of Creative Arts and Technologies, University of Luton, Luton, UNITED KINGDOM
Chapter 1. Neural Network Systems Technology and Applications in CAD/CAM Integration

Xuan F. Zha, Design and Process Group, Manufacturing Systems Integration Division, National Institute of Standards and Technology, Gaithersburg, Maryland, USA
Chapter 6. Evaluation and Selection in Product Design for Mass Customization
INTELLIGENT KNOWLEDGE-BASED SYSTEMS
BUSINESS AND TECHNOLOGY IN THE NEW MILLENNIUM
VOLUME 2 INFORMATION TECHNOLOGY
Edited by CORNELIUS T. LEONDES University of California, Los Angeles, USA
KLUWER ACADEMIC PUBLISHERS
BOSTON / DORDRECHT / LONDON
Printed on acid-free paper. Printed in the United States of America.
CONTENTS
Foreword vii
Preface ix
List of Contributors xiii

Volume 2. Information Technology
1. Techniques in Integrated Development and Implementation of Enterprise Information Systems 3
CHOON SEONG LEEM AND JONG WOOK SUH

2. Information Systems Frameworks and Their Applications in Manufacturing and Supply Chain Systems 27
ADRIAN E. CORONADO MONDRAGON, ANDREW C. LYONS, AND DENNIS F. KEHOE

3. Modelling Techniques in Integrated Operations and Information Systems in Manufacturing Systems 64
Q. WANG, C. R. CHATWIN, AND R. C. D. YOUNG

4. Techniques and Analyses of Sequential and Concurrent Product Development Processes 123
MARKO STARBEK, JANEZ GRUM, ALES BREZOVAR, AND JANEZ KUSAR

5. Design and Modeling Methods for Components Made of Multi-Heterogeneous Materials in High-Tech Applications 177
KE-ZHANG CHEN AND XIN-AN FENG

6. Quality and Cost of Data Warehouse Views 224
ANDREAS KOELLER, ELKE A. RUNDENSTEINER, AMY LEE, AND ANISOARA NICA

7. Web Data Extraction Techniques and Applications Using the Extensible Markup Language (XML) 259
JUSSI MYLLYMAKI AND JARED JACKSON

8. Product Life Cycle Management in the Digital Age 293
JORG NIEMANN AND E. WESTKAMPER

9. Product Redesign and Pricing in Response to Competitor Entry: A Marketing-Production Perspective 324
GEORGE C. HADJINICOLA AND K. RAVI KUMAR

10. Knowledge Discovery by Means of Intelligent Information Infrastructure Methods and Their Applications 347
HENRY C. W. LAU, CHRISTINA W. Y. WONG, AND ANDREW NING
CO NTRIBUTORS
VOLUME 1: KNOWLEDGE-BASED SYSTEMS N . Bassiliades Departm ent of Informatics Aristotle University of T hessalo niki Th essaloniki GR EEC E Chapter 6. A~regator: A Knowledge-Based Comparison Chart Builderfor eShoppitlg P eter Bernus Griffith U niversity Schoo l of C IT Nathan Qu eensland AUSTRALIA Chapter 10. Business Process Modeling ami Its Applications ill the Business Environment Mariano Corso Department of Management Engineerin g Polytechni c U niversity of Mailand Milano ITALY Chapter 2. Knowledge Manaoement S ystems i l l Continuous Product lnnnovation xiii
xiv
Co ntributors
Eugenio di Sciascio Dipartimento Elett rotecnica ed Elettron ica Politecnico di Bari Bari ITALY Chapter 11. Knowledge-Based Systems Technology and Applications in Image Retrieval Francesco M. Donini Uni versita della Tuscia Viterbo ITALY Chapter 11. Knowledge-Based Systems Technology and Applications ill Image Retrieval Janis Grundspenkis Faculty of Computer Science and Information Technology R iga Techni cal University R iga LAT VIA Chapter 7. Impact if the IntelligentAgen: Paradigm on Knowledge Management P. Humphreys Faculty of Business and Management Un iversity of Ulster N orthern Ireland UNITE D KINGDOM Chapter 4. Knowledge-Based Systems Technology in the Make-or-Buv Decision in Manufacturing Strategy Brane Kalpic ETI Elektroelement Jt . St. Co mpo Izlake SLOVEN IA Chapter 10. Business Process Modeling and Its Applications in the Business Environment Marite Kirikova Faculty of C omputer Science and Information Technol ogy R iga Techni cal University R iga LATVIA Chapter 7. Impact if the Intelligent Agent Paradigm on Knowledge Management F. Kokkoras D epartm ent of Informatics Aristotle Uni versity of Thessaloniki
Shian-Hua Lin Department of Computer Science and Information Engineering National Chi Nan University Taiwan REPUBLIC OF CHINA Chapter 5. Intelligent Internet Information Systems in Knowledge Acquisition: Techniques and Applications Antonella Martini Faculty of Engineering University of Pisa Pisa ITALY Chapter 2. Knowledge Management Systems in Continuous Product Innovation R. McIvor Faculty of Business and Management University of Ulster UNITED KINGDOM Chapter 4. Knowledge-Based Systems Technology in the Make-or-Buy Decision in Manufacturing Strategy Istvan Mezgar CIM Research Laboratory Computer and Automations Research Institute Hungarian Academy of Sciences Budapest HUNGARY Chapter 9. Security Technologies to Guarantee Safe Business Processes in Smart Organizations
Marina Mongiello Dipartimento di Elettrotecnica ed Elettronica Politecnico di Bari Bari ITALY Chapter 11. Knowledge-Based Systems Technology and Applications in Image Retrieval Ralf Muhlberger University of Queensland Information Technology & Electrical Engineering
xvi
Contribotors
Queensland AUSTRALIA Chapter 10. Business Process Modeling and Its Applications in the Business Environment
Cezary Orlowski Gd ansk University of Technology Gdansk POLAND Chapter 8. Methods of Blli/ding Knowledge-Based Systems Applied in Software Project Management
Emilio Paolucci Depar tment of Operation and Business Ma nagement Polytechnic University of Turin Torino ITALY Chapter 2. Knowledge Management Systems in Continuous Product Innovation
Luisa Pelle grini Faculty of Engineering University of Pisa Pisa ITALY Chapter 2. Knou.ledye M anaoement Systems ill Continuous Product Innovation
Ram D. Sriram D esign and Pro cess Group Manufacturing Systems Integration Division N ational Institute of Stand ards and Technology Gaithe rsburg, Maryland U SA Chapter 1. Platform-Based Product Design and Development: Knowledge Support Strategy and Implementation
N ikos C. Tsourveloudis D epartment of Production Eng inee ri ng and M anagem ent Techni cal University of Crete C hania, C rete GRE EC E Chapter 3. Knowledge-Based Measurement of Enterprise Agility I. Vlahavas Department of Informatics Ari stotle University of T hessalon iki
Thessaloniki GREECE
Chapter 6. A~regator: A Knowledge-Based Comparison Chart Builderfor eShopping Xuan F. Zha Design and Process Group Manufacturing Systems Integration Division National Institute of Standards and Technology Gaithersburg, Maryland USA Chapter 1. Plaiform-Based Product Design and Development: Knowledge Support Strategy
and Implementation VOLUME 2: INFORMATION TECHNOLOGY Ales Brezovar Faculty of Mechanical Engineering University of Ljubljana Ljubljana SLOVENIA Chapter 4. Techniques and Analysis of Sequential and Concurrent Product Development
Processes
Chris R. Chatwin School of Engineering and Information Technology University of Sussex Brighton UNITED KINGDOM Chapter 3. Modeling Techniques in Integrated Operations and Information Systems in
Manufacturing
Ke-Zhang Chen Department of Mechanical Engineering The University of Hong Kong HONG KONG Chapter 5. Design and Modeling Methods for Components Made of Multi-Heterogeneous
Materials in High-Tech Applications
Adrian E. Coronado Management School The University of Liverpool Liverpool UNITED KINGDOM Chapter 2. Information Systems Frameworks and TheirApplications in Mamifacturing
Systems
xviii
Contributors
Xin-An Feng Schoo l of Mechanical Engineering Dalian Uni versity of Techno logy Dalian C H IN A Chapter 5. Design and i'v[odeling Methodsjor Components Made 1 Multi-Heterogeneous Materials in High-Tech Applications Janez Grum Faculty of Mechanical Engin eering University of Ljublj ana Lj ubljana SLOVEN IA Chapter 4. Techniques and Analysis qf Sequential and Concurrent Product Development Processes George Hadjinicola De partment of Public and Business Admi nistration School of Economics and Managem ent Un iversity of Cyprus N icosia CYPRU S Chapter 9. Product Design and Pricing in Response to Competitor Entry: A MarketingProduction Perspective Jared Jackson IBM Almaden R esearch Ce nte r San Jose, California USA Chapter 7. web Data Extraction Techniques and Applications Using the Extensible Matkup LAnguage (XML)
D. F. Kehoe Management School Th e University of Liverpool Liverpool UNITED KINGDOM Chapter 2. Inf ormation Systems Fram eworks and TheirApplications in Manufacturing Systems Andreas Koeller De partment of C omputer Science Montclair State Un iversity U pper Mo ntclair, N ew Jersey USA Chapter 6. Quality and Cost 1 Data Warehol/se Views
K. Ravi Kumar Department of Information and Operations Management Marshall School of Business University of Southern California Los Angeles, California USA Chapter 9. Product Redesign and Pricing in Response to Competitor Entry: A MarketingProduction Perspective Janez Kusar Faculty of Mechanical Engineering University of Ljubljana Ljubljana SLOVENIA Chapter 4. Techniques and Analyses of Sequential and Concurrent Product Development Processes Henry C. W Lau Department of Industrial and Systems Engineering The Hong Kong Polytechnic University Hunghom HONG KONG Chapter 10. Knowledge Discovery by Means of Intelligent Information Infrastructure Methods and Their Applications Amy Lee The Ohio State University Columbus, Ohio USA Chapter 6. Quality and Cost ~f Data Warehouse Views Choon Seong Leem School of Computer and Industrial Engineering Yonsei University Seoul KOREA Chapter 1. Techniques in Integrated Development and Implementation oi Bnterprise Information Systems A. C. Lyons Management School The University of Liverpool Liverpool UNITED KINGDOM Chapter 2. Information Systems Frameworks and TheirApplications in Manuiacturino Systems
xx
Contributors
Jussi Myllymaki IBM Almaden R esearch Cen ter San Jose, Californ ia USA Chapter 7. l# b Data Extraction Techniq ues and Applications Using the Extensible Marhup L mgllage (XML) Anisoara Nica Sybase Incorp orated Waterloo, O ntario Canada Chapter 6. Quality and Cost of Data Wa rehouse Views Jorg Niemann IFF University of Stuttgart Fraunh ofer IPA Stuttgart GERMANY Chapter 8. Product Life Cycle Management ill the Digital Age Andrew Ning Department of Industrial and Systems Engineerin g The Hong Kong Polytechn ic University Hun ghom HONG KO NG Chapter 10. Knowledge Discovery by Means of Intelligent Inf ormatioll Injrastruaure Methods and Their Applications Elke A. Rundensteiner Depa rtmen t of Comp uter Science Worcester Polytechnic Institute Worcester Massachu setts USA Chapter 6. Quality and Cost of Data Warehouse Views Marko Starbek Faculty of Mechanical Engineering Un iversity of Ljubljana Ljubljana SLOVEN IA Chapter 4. Techniques and Analyses of Sequential and Concurrent Product Development Processes Jong Wook Suh Schoo l of Co mputer and Industrial Engineering Yonsei University
Seoul KOREA Chapter 1. Techniques in Integrated Development andImplementation ofEnterprise Information Systems
Qian Wang School of Engineering and Information Technology University of Sussex Brighton and Department of Mechanical Engineering University of Bath Bath UNITED KINGDOM Chapter 3. Modeling Techniques in Integrated Operations and Information Systems in Manufacturing Systems Engelbert Westkamper IFF University of Stuttgart Fraunhofer IPA Stuttgart GERMANY Chapter 8. Product Life Cycle Management in the Digital Age Christina w: Y. Wong Department of Industrial and Systems Engineering The Hong Kong Polytechnic University Hunghom HONG KONG Chapter 10. Knowledge Discovery by Means if Intelligent Information Infrastructure Methods and Their Applications R. C. D. Young School of Engineering and Information Technology University of Sussex Brighton UNITED KINGDOM Chapter 3. Modeling Techniques in Integrated Operations and Information Systems in Manufacturing Systems VOLUME 3: EXPERT AND AGENT SYSTEMS Dimitris Askounis Institute of Communications & Computer Systems National Technical University of Athems
xxii
Contributors
Ath ens GR EECE Chapter 2. Expert Systems Technology in Production Plamling and Scheduling
G. A. Britto n Desig n R esearch Center School O f M echanical and Production En gineering N anyang Technological University SINGAPO R E Chapter 1. Techniques in Knowledge-Based E:'..pert Systemsjor the Design of Engineering
Systems
Jing Dai School of Computing N ational University of Singapore SINGAPO R E Chapter 9. Finding Patterns in Image Databases Robert Gay Institute of C om municatio n and Information Systems School of Electrical and Electronic En gin eering N anyang Techn ological University SINGAPO R E Chapter 6. Agmt-Based el.eaminy Systems: A Goal-BasedApproach An gela Goh School of Computer En gin eering N anyang Technological University SINGAPOR E Chapter 4. TIle Knowledge Base of a BlB eCommCTce Multi-Agent System Ivan Romero Hernandez Techn ological University of Greno ble LCI S Research Labor atory Valence FRANCE Chapter 5. From R oles to Agents: Considera tions on Formal Agent Modeli,~~ and
Implementation Tu Bao Ho Japan Advanced Institute of Science and Technology Ishikawa JAPAN
Chapter 7. Comoining Temporal Abstraction and Data-Milling Methods in Medical Data A nalysis
Wynne Hsu, School of Computing, National University of Singapore, SINGAPORE. Chapter 9. Finding Patterns in Image Databases
Chun-Che Huang, Department of Information Management, National Chi Nan University, Taiwan, REPUBLIC OF CHINA. Chapter 3. Applying Intelligent Agent-Based Support Systems in Agile Business Processes
K. Karibasappa, Department of Electronics and Telecommunication Engineering, University College of Engineering, Burla, Sambalpur, Orissa, INDIA. Chapter 10. Cognition Techniques and Their Applications
Nelly Kasim, Singapore-MIT Alliance, National University of Singapore, SINGAPORE. Chapter 4. The Knowledge Base of a B2B eCommerce Multi-Agent System
Saori Kawasaki, Japan Advanced Institute of Science and Technology, Ishikawa, JAPAN. Chapter 7. Combining Temporal Abstraction and Data-Mining Methods in Medical Data Analysis
Jean-Luc Koning, Technological University of Grenoble, LCIS Research Laboratory, Valence, FRANCE. Chapter 5. From Roles to Agents: Considerations on Formal Agent Modeling and Implementation
Si Quang Le, Japan Advanced Institute of Science and Technology, Ishikawa, JAPAN. Chapter 7. Combining Temporal Abstraction and Data-Mining Methods in Medical Data Analysis
Mong Li Lee, School of Computing, National University of Singapore, SINGAPORE. Chapter 9. Finding Patterns in Image Databases
Antonio Liotta, Center for Communication Systems Research, University of Surrey, Guildford, Surrey, UNITED KINGDOM. Chapter 8. Distributed Monitoring: Methods, Means, and Technologies
Kostas Metaxiotis, Institute of Communications & Computer Systems, National Technical University of Athens, Athens, GREECE. Chapter 2. Expert Systems Technology in Production Planning and Scheduling
Chunyan Miao, School of Computer Engineering, Nanyang Technological University, SINGAPORE. Chapter 4. The Knowledge Base of a B2B eCommerce Multi-Agent System
Yuan Miao, Institute of Communication and Information Systems, Nanyang Technological University, SINGAPORE. Chapter 6. Agent-Based eLearning Systems: A Goal-Based Approach
Trong Dung Nguyen, Japan Advanced Institute of Science and Technology, Ishikawa, JAPAN. Chapter 7. Combining Temporal Abstraction and Data-Mining Methods in Medical Data Analysis
Srikanta Patnaik, Department of Electronics and Telecommunication Engineering, University College of Engineering, Burla, Sambalpur, Orissa, INDIA. Chapter 10. Cognition Techniques and Their Applications
John Psarras, Institute of Communications & Computer Systems, National Technical University of Athens, Athens, GREECE. Chapter 2. Expert Systems Technology in Production Planning and Scheduling
Zhiqi Shen, Institute of Communication and Information Systems, School of Electrical and Electronic Engineering, Nanyang Technological University, SINGAPORE. Chapter 6. Agent-Based eLearning Systems: A Goal-Based Approach
S. B. Tor, Singapore-MIT Alliance, Nanyang Technological University, SINGAPORE. Chapter 1. Techniques in Knowledge-Based Expert Systems for the Design of Engineering Systems
W. Y. Zhang, Design Research Center, School of Mechanical and Production Engineering, Nanyang Technological University, SINGAPORE. Chapter 1. Techniques in Knowledge-Based Expert Systems for the Design of Engineering Systems
VOLUME 4: INTELLIGENT SYSTEMS

Cheng-Leong Ang, Singapore Institute of Manufacturing Technology, SINGAPORE. Chapter 4. An Intelligent Hybrid System for Business Forecasting
Sistine A. Barretto, Advanced Computing Research Centre, The University of South Australia, Adelaide, AUSTRALIA. Chapter 6. Techniques in the Utilization of the Internet and Intranets in Facilitating the Development of Clinical Decision Support Systems in the Process of Patient Care
Billy Fenton, International Test Technologies and University of Ulster, Letterkenny, Donegal, IRELAND. Chapter 5. Intelligent Systems Technology in the Fault Diagnosis of Electronic Systems
Robert Gay, Institute of Communication and Information Systems, School of Electrical and Electronic Engineering, Nanyang Technological University, SINGAPORE. Chapter 4. An Intelligent Hybrid System for Business Forecasting
Victor Giurgiutiu, Mechanical Engineering Department, University of South Carolina, Columbia, South Carolina, USA. Chapter 8. Mechatronics and Smart Structures Design Techniques for Intelligent Products, Processes and Systems
Marc-Philippe Huget, Leibnitz Laboratory, Grenoble, FRANCE. Chapter 9. Engineering Interaction Protocols for Multiagent Systems
Richard W. Jones, School of Engineering, University of Northumbria, Newcastle upon Tyne, England, UNITED KINGDOM. Chapter 2. Intelligent Patient Monitoring in the Intensive Care Unit and the Operating Room
Jean-Luc Koning, Technological University of Grenoble, LCIS Research Laboratory, Valence, FRANCE. Chapter 9. Engineering Interaction Protocols for Multiagent Systems
Xiang Li, Singapore Institute of Manufacturing Technology, SINGAPORE. Chapter 4. An Intelligent Hybrid System for Business Forecasting
Liam Maguire, Department of Informatics, University of Ulster, Derry, NORTHERN IRELAND. Chapter 5. Intelligent Systems Technology in the Fault Diagnosis of Electronic Systems
T. M. McGinnity, Department of Informatics, University of Ulster, Derry, NORTHERN IRELAND. Chapter 5. Intelligent Systems Technology in the Fault Diagnosis of Electronic Systems
Tolety Siva Perraju, Verizon Communications, Waltham, Massachusetts, USA. Chapter 3. Mission Critical Intelligent Systems
Mauricio Sanchez-Silva, Department of Civil and Environmental Engineering, Universidad de los Andes, Bogota, COLOMBIA. Chapter 7. Risk Analysis and the Decision-Making Process in Engineering
Garimella Uma, South Asia International Institute, Hyderabad, INDIA. Chapter 3. Mission Critical Intelligent Systems
James R. Warren, Advanced Computing Research Centre, The University of South Australia, Mawson Lakes, AUSTRALIA. Chapter 6. Techniques in the Utilization of the Internet and Intranets in Facilitating the Development of Clinical Decision Support Systems in the Process of Patient Care
Xuan F. Zha, Design and Process Group, Manufacturing Systems Integration Division, National Institute of Standards and Technology, Gaithersburg, Maryland, USA. Chapter 1. Artificial Intelligence and Integrated Intelligent Systems: Applications in Product Design and Development
VOLUME 5: NEURAL NETWORKS, FUZZY THEORY AND GENETIC ALGORITHM TECHNIQUES

Kazem Abhary, School of Advanced Manufacturing and Mechanical Engineering, University of South Australia, Mawson Lakes, AUSTRALIA. Chapter 8. Assembly Sequence Optimization Using Genetic Algorithms
F. Admiraal-Behloul, Division of Image Processing, Leiden University Medical Center, Leiden, THE NETHERLANDS. Chapter 4. Fuzzy Rule Extraction Using Radial Basis Function Neural Networks in High-Dimensional Data
Kemal Ahmet, Faculty of Creative Arts and Technologies, University of Luton, Luton, UNITED KINGDOM. Chapter 1. Neural Network Systems Technology and Applications in CAD/CAM Integration
Carl K. Chang, Department of Computer Science, Iowa State University, Ames, Iowa, USA. Chapter 7. Genetic Algorithm Techniques and Applications in Management Systems
Lian Ding, Faculty of Creative Arts and Technologies, University of Luton, Luton, UNITED KINGDOM. Chapter 1. Neural Network Systems Technology and Applications in CAD/CAM Integration
Shing-Hwang Doong, Department of Information Management, Shu-Te University, Yen Chau, TAIWAN. Chapter 10. Computational Intelligence for Facility Location Allocation Problems
Yujia Ge, Department of Computer Science, Iowa State University, Ames, Iowa, USA. Chapter 7. Genetic Algorithm Techniques and Applications in Management Systems
Andrew Kusiak, Department of Mechanical and Industrial Engineering, University of Iowa, Iowa City, Iowa, USA. Chapter 5. Fuzzy Decision Modeling of Product Development Processes
Chih-Chin Lai, Department of Information Management, Shu-Te University, Yen Chau, TAIWAN. Chapter 10. Computational Intelligence for Facility Location Allocation Problems
Wen F. Lu, Product Design and Development Group, Singapore Institute of Manufacturing Technology, SINGAPORE. Chapter 6. Evaluation and Selection in Product Design for Mass Customization
Lee H. S. Luong, School of Advanced Manufacturing and Mechanical Engineering, University of South Australia, Mawson Lakes, AUSTRALIA. Chapter 8. Assembly Sequence Optimization Using Genetic Algorithms
Romeo Marin Marian, CSIRO Manufacturing & Infrastructure Technology, Woodville North, SA, AUSTRALIA. Chapter 8. Assembly Sequence Optimization Using Genetic Algorithms
Stergios Papadimitriou, Department of Information Management, Technological Education Institute of Kavala, Kavala, GREECE. Chapter 9. Kernel-Based Self-Organized Maps Trained with Supervised Bias for Gene Expression Data Mining
Johan H. C. Reiber, Division of Image Processing, Department of Radiology, Leiden University Medical Center, Leiden, THE NETHERLANDS. Chapter 4. Fuzzy Rule Extraction Using Radial Basis Function Neural Networks in High-Dimensional Data
Kwang-Kyu Seo, Division of Computer, Information and Telecommunication Engineering, Sangmyung University, Chungnam, KOREA. Chapter 2. Neural Network Systems Technology and Applications in Product Life-Cycle Cost Estimates
Joaquin Sitte, Faculty of Information Technology, Queensland University of Technology, Brisbane, AUSTRALIA. Chapter 3. Neural Network Systems Technology in the Analysis of Financial Time Series
Renate Sitte, Faculty of Engineering and Information Technology, Griffith University, Queensland, AUSTRALIA. Chapter 3. Neural Network Systems Technology in the Analysis of Financial Time Series
Ram D. Sriram, Design and Process Group, Manufacturing Systems Integration Division, National Institute of Standards and Technology, Gaithersburg, Maryland, USA. Chapter 6. Evaluation and Selection in Product Design for Mass Customization
Fu J. Wang, Design and Process Group, Manufacturing Systems Integration Division, National Institute of Standards and Technology, Gaithersburg, Maryland, USA. Chapter 6. Evaluation and Selection in Product Design for Mass Customization
Juite Wang, Department of Industrial Engineering, Feng Chia University, Taichung, Taiwan, REPUBLIC OF CHINA. Chapter 5. Fuzzy Decision Modeling of Product Development Processes
Chih-Hung Wu, Department of Information Management, Shu-Te University, Yen Chau, TAIWAN. Chapter 10. Computational Intelligence for Facility Location Allocation Problems
Yong Yue, Faculty of Creative Arts and Technologies, University of Luton, Luton, UNITED KINGDOM. Chapter 1. Neural Network Systems Technology and Applications in CAD/CAM Integration
Xuan F. Zha, Design and Process Group, Manufacturing Systems Integration Division, National Institute of Standards and Technology, Gaithersburg, Maryland, USA. Chapter 6. Evaluation and Selection in Product Design for Mass Customization
INTELLIGENT KNOWLEDGE-BASED SYSTEMS
BUSINESS AND TECHNOLOGY IN THE NEW MILLENNIUM
VOLUME 3 EXPERT AND AGENT SYSTEMS
Edited by
CORNELIUS T. LEONDES
University of California, Los Angeles, USA
" ~.
K LUWER ACADEMIC PUBLISHERS
BOSTON/DORDRECHTILONDON
Printed on acid-free paper. Printed in the United States of America.
CONTENTS
Foreword vii
Preface ix
List of contributors xiii

Volume 3. Expert and Agent Systems
1. Techniques in Knowledge-Based Expert Systems for the Design of Engineering Systems 3 G. A. BRITTON, S. B. TOR AND W. Y. ZHANG
2. Expert Systems Technology in Production Planning and Scheduling 55 KOSTAS METAXIOTIS, DIMITRIS ASKOUNIS AND JOHN PSARRAS
3. Applying Intelligent Agent-Based Support Systems in Agile Business Processes 76 CHUN-CHE HUANG
4. The Knowledge Base of a B2B eCommerce Multi-Agent System 132 CHUNYAN MIAO, NELLY KASIM AND ANGELA GOH
5. From Roles to Agents: Considerations on Formal Agent Modeling and Implementation 154 IVAN ROMERO HERNANDEZ AND JEAN-LUC KONING
6. Agent-Based eLearning Systems: A Goal-Based Approach 182 ZHIQI SHEN, ROBERT GAY AND YUAN MIAO
7. Combining Temporal Abstraction and Data Mining Methods in Medical Data Analysis 198 TU BAO HO, TRONG DUNG NGUYEN, SAORI KAWASAKI AND SI QUANG LE
8. Distributed Monitoring: Methods, Means and Technologies 223 ANTONIO LIOTTA
9. Finding Patterns in Image Databases 254 WYNNE HSU, MONG LI LEE AND JING DAI
10. Cognition Techniques and Their Applications 273 SRIKANTA PATNAIK AND K. KARIBASAPPA
INTELLIGENT KNOWLEDGE-BASED SYSTEMS
BUSINESS AND TECHNOLOGY IN THE NEW MILLENNIUM
VOLUME 4 INTELLIGENT SYSTEMS
Edited by CORNELIUS T. LEONDES
University of California, Los Angeles, USA
KLUWER ACADEMIC PUBLISHERS
BOSTON / DORDRECHT / LONDON
CONTENTS

Volume 4. Intelligent Systems

1. Artificial Intelligence and Integrated Intelligent Systems in Product Design and Development 3 XUAN F. ZHA
2. Intelligent Patient Monitoring in the Intensive Care Unit and the Operating Room 60 RICHARD W. JONES
3. Mission Critical Intelligent Systems 120 TOLETY SIVA PERRAJU AND GARIMELLA UMA
4. An Intelligent Hybrid System for Business Forecasting 147 XIANG LI, CHENG-LEONG ANG, AND ROBERT GAY
5. Intelligent Systems Technology in the Fault Diagnosis of Electronic Systems 212 BILLY FENTON, T. M. MCGINNITY, AND L. P. MAGUIRE
6. Techniques in the Utilization of the Internet and Intranets in Facilitating the Development of Clinical Decision Support Systems in the Process of Patient Care 250 SISTINE A. BARRETTO AND JAMES R. WARREN
7. Risk Analysis and the Decision-Making Process in Engineering 297 MAURICIO SANCHEZ-SILVA
8. Mechatronics and Smart Structures Design Techniques for Intelligent Products, Processes, and Systems 330 VICTOR GIURGIUTIU
9. Engineering Interaction Protocols for Multiagent Systems 409 MARC-PHILIPPE HUGET AND JEAN-LUC KONING
xx
Contributors
Jussi Myllymaki IBM Almaden Research Center San Jose, California USA Chapter 7. T#b Data Extraction Techniques and Applications Using the Extensible Markup Language (XML) Anisoara Nica Sybase Incorporated Waterloo, Ontario Canada Chapter 6. Quality and Cost if Data Warehouse Views Jorg Niemann IFF University of Stuttgart Fraunhofer IPA Stuttgart GERMANY Chapter 8. Product Life Cycle Management in the Digital Age Andrew Ning Department of Industrial and Systems Engineering The Hong Kong Polytechnic University Hunghom HONG KONG Chapter 10. Knowledge Discovery by Means of Intelligent Information Infrastructure Methods and TheirApplications Elke A. Rundensteiner Department of Computer Science Worcester Polytechnic Institute Worcester Massachusetts USA Chapter 6. Quality and Cost if Data Warehouse Views Marko Starbek Faculty of Mechanical Engineering University of Ljubljana Ljubljana SLOVENIA Chapter 4. Techniques and Analyses if Sequential and Concurrent Product Development Processes Jong Wook Suh School of Computer and Industrial Engineering Yonsei University
Seoul KOREA Chapter 1. Techniques in Integrated Development andImplementation ofEnterprise Information
Systems
Qian Wang School of Engineering and Information Technology University of Sussex Brighton and Department of Mechanical Engineering University of Bath Bath UNITED KINGDOM Chapter 3. Modeling Techniques in Integrated Operations and Information Systems in
Manufacturing Systems
Engelbert Westkamper IFF University of Stuttgart Fraunhofer IPA Stuttgart GERMANY Chapter 8. Product Life Cycle Management in the Digital Age Christina W. Y. Wong Department of Industrial and Systems Engineering The Hong Kong Polytechnic University Hunghom HONG KONG Chapter 10. Knowledge Discovery by Means if Intelligent Information Injrastructure Methods and Their Applications R. C. D. Young School of Engineering and Information Technology University of Sussex Brighton UNITED KINGDOM Chapter 3. Modeling Techniques in Integrated Operations and Information Systems in Manufacturing Systems VOLUME 3: EXPERT AND AGENT SYSTEMS Dimitris Askounis Institute of Communications & Computer Systems National Technical University of Athems
xxii
Contributors
Athens GREECE Chapter 2. Expert Systems Technology in Production Planning and Scheduling G. A. Britton Design Research Center School Of Mechanical and Production Engineering Nanyang Technological University SINGAPORE Chapter 1. Techniques in Knowledge-Based Expert Systems for the Design Systems
of Engineering
Jing Dai School of Computing National University of Singapore SINGAPORE Chapter 9. Finding Patterns in Image Databases Robert Gay Institute of Communication and Information Systems School of Electrical and Electronic Engineering Nanyang Technological University SINGAPORE Chapter 6. Agent-Based eLearning Systems: A Goal-Based Approach Angela Goh School of Computer Engineering Nanyang Technological University SINGAPORE Chapter 4. The Knowledge Base of a B2B eCommerce Multi-Agent System Ivan Romero Hernandez Technological University of Grenoble LCIS Research Laboratory Valence FRANCE Chapter 5. From Roles to Agents: Considerations on Formal Agent Modeling and Implementation Tu Bao Ho Japan Advanced Institute of Science and Technology Ishikawa JAPAN Chapter 7. Combining Temporal Abstraction and Data-Mining Methods in Medical Data Analysis
Wynne Hsu School of Computing National University of Singapore SINGAPORE Chapter 9. Finding Patterns in Image Databases Chun-Che Huang Department of Information Management National Chi Nan University Taiwan REPUBLIC OF CHINA Chapter 3. Applying Intelligent Agent-Based Support Systems in Agile Business Processes K. Karibasappa Department of Electronics and Telecommunication Engineering University College of Engineering, Burla Sambalpur, Orissa INDIA Chapter 10. Cognition Techniques and Their Applications Nelly Kasim Singapore-MIT Alliance National University of Singapore SINGAPORE Chapter 4. The Knowledge Base (~f a B2B eCommerce Multi-Agent System Saori Kawasaki Japan Advanced Institute of Science and Technology Ishikawa JAPAN Chapter 7. Combining Temporal Abstraction and Data-Mining Methods in Medical Data Analysis Jean-Luc Koning Technological University of Grenoble LCIS Research Laboratory Valence FRANCE Chapter 5. From Roles to Agents: Considerations Implementation
0/1
Formal Agellt Modelillg and
Si Quang Le Japan Advanced Institute of Science and Technology Ishikawa
xxiv
Contributors
JAPAN Chapter 7. Combining Temporal Abstraction and Data-Mining Methods in Medical Data Analysis Mong Li Lee School of Computing National University of Singapore SINGAPORE Chapter 9. Finding Patterns in Image Databases Antonio Liotta Center for Communication Systems Research University of Surrey Guildford, Surrey UNITED KINGDOM Chapter 8. Distributed Monitoring: Methods, Means, and Technologies Kostas Metaxiotis Institute of Communications & Computer Systems National Technical University of Athens Athens GREECE Chapter 2. Expert Systems Technology in Production Planning and Scheduling Chunyan Miao School of Computer Engineering Nanyang Technological University SINGAPORE Chapter 4. The Knowledge Base if a B2B eCommerce Multi-Agent System Yuan Miao Institute of Communication and Information Systems Nanyang Technological University SINGAPORE Chapter 6. Agent-Based eLearning Systems: A Goal-Based Approach Trong Dung Nguyen Japan Advanced Institute of Science and Technology Ishikawa JAPAN Chapter 7. Combining Temporal Abstraction and Data-Mining Methods in Medical Data Analysis Srikanta Patnaik Department of Electronics and Telecommunication Engineering University College of Engineering, Buda
Sambalpur, Orissa INDIA Chapter 10. Cognition Techniques and Their Applications
John Psarras Institute of Communications & Computer Systems National Technical University of Athens Athens GREECE Chapter 2. Expert Systems Technology in Production Planning and Scheduling Zhiqi Shen Institute of Communication and Information Systems School of Electrical and Electronic Engineering Nanyang Technological University SINGAPORE Chapter 6. Agent-Based eLearning Systems: A Goal-Based Approach S. B. Tor Singapore-MIT Alliance Nanyang Technological University SINGAPORE Chapter 1. Techniques in Knowledge-Based Expert Systems for the Design Systems
of Engineering
w. Y. Zhang Design Research Center School of Mechanical and Production Engineering Nanyang Technological University SINGAPORE Chapter 1. Techniques in Knowledge-Based Expert Systems for the Design of Engineering Systems VOLUME 4: INTELLIGENT SYSTEMS Cheng-Leong Ang Singapore Institute of Manufacturing Technology SINGAPORE Chapter 4. An Intelligent Hybrid System for Business Forecasting Sistine A. Barretto Advanced Computing Research Centre The University of South Australia Adelaide
xxvi
Contributors
AUSTRALIA Chapter 6. Techniques in the Utilization of the Internet and Intranets in Facilitating the Development of Clinical Decision Support Systems in the Process of Patient Care
Billy Fenton International Test Technologies and University of Ulster Letterkenny, Donegal IRELAND Chapter 5. Intelligent Systems Technology in the Fault Diagnosis of Electronic Systems Robert Gay Institute of Communication and Information Systems School of Electrical and Electronic Engineering Nanyang Technological University SINGAPORE Chapter 4. An Intelligent Hybrid Systemfor Business Forecasting Victor Giurgiutiu Mechanical Engineering Department University of South Carolina Columbia, South Carolina USA Chapter 8. Mechatronics and Smart Structures Design Techniquesfor Intelligent Products, Processes and Systems Marc-Philippe Huget Leibnitz Laboratory Grenoble France Chapter 9. Engineering Interaction Protocols for Multiagent Systems Richard w: Jones School of Engineering University of Northumbria Newcastle upon Tyne England UNITED KINGDOM Chapter 2. Intelligent Patient Monitoring in the Intensive Care Unit and the Operating Room Jean-Luc Koning Technological University of Grenoble LCIS Research Laboratory
Valence FRANCE Chapter 9. EngineerinJ? Interaction Protocols for Multiagent Systems
Xiang Li Singapore Institute of Manufacturing Technology SINGAPORE Chapter 4. An Intelligent Hybrid System for Business Forecasting Liam Maguire Department of Informatics University of Ulster Derry NORTHERN IRELAND Chapter 5. Intelligent Systems Technology in the Fault Diagnosis
of Electronic Systems
T. M. McGinnity Department of Informatics University of Ulster Derry NORTHERN IRELAND Chapter 5. Intelligent Systems Technolo!?y in the Fault Diagnosis
of Electronic Systems
Tolety Siva Perraju Verizon Communications Waltham, Massachusetts USA Chapter 3. Mission Critical Intelligent Systems Mauricio Sanchez-Silva Department of Civil and Environmental Engineering Universidad de los Andes Bogota COLOMBIA Chapter 7. Risk Analysis and the Decision-Makino Process in Engineering Garimella Uma South Asia International Institute Hyderabad INDIA Chapter 3. Mission Critical Intelligent Systems James R. Warren Advanced Computing Research Centre The University of South Australia
xxviii
Contributors
Mawson Lakes AUSTRALIA Chapter 6. Techniques in the Utilization of the Internet and Intranets in Facilitating the Development of Clinical Decision Support Systems in the Process of Patient Care
Xuan F. Zha Design and Process Group Manufacturing Systems Integration Division National Institute of Standards and Technology Gaithersburg, Maryland USA Chapter 1. Artificial Intelligence and Integrated Intelligent Systems: Applications in Product Design and Development
VOLUME 5: NEURAL NETWORKS, FUZZY THEORY AND GENETIC ALGORITHM TECHNIQUES Kazem Abhary School of Advanced Manufacturing and Mechanical Engineering University of South Australia Mawson Lakes AUSTRALIA Chapter 8. Assembly Sequence Optimization Using Genetic Algorithms F. Admiraal-Behloul Division of Image Processing Leiden University Medical Center Leiden THE NETHERLANDS Chapter 4. Fuzzy Rule Extraction Using Radial Basis Function Neural Networks in High-Dimensional Data
Kemal Ahmet Faculty of Creative Arts and Technologies University of Luton Luton UNITED KINGDOM Chapter 1. Neural Network Systems Technology and Applications in CAD/CAM Integration Carl K. Chang Department of Computer Science Iowa State University Ames, Iowa USA Chapter 7. Genetic Algorithm Techniques and Applications in Management Systems
Lian Ding Faculty of Creative Arts and Technologies University of Luton Luton UNITED KINGDOM Chapter 1. Neural Network Systems Technology and Applications in CAD / CAM Integration Shing-Hwang Doong Department of Information Management Shu- Te University Yen Chau TAIWAN Chapter 10. Computational Intelligence for Facility Location Allocation Problems Yujia Ge Department of Computer Science Iowa State University Ames, Iowa USA Chapter 7. Genetic Algorithm Techniques and Applications in Management Systems Andrew Kusiak Department of Mechanical and Industrial Engineering University ofIowa Iowa City, Iowa USA Chapter 5. Fuzzy Decision Modeling of Product Development Processes Chih-Chin Lai Department of Information Management Shu-Te University Yen-Chau TAIWAN Chapter 10. Computational Intelligence for Facility Location Allocation Problems Wen F. Lu Product Design and Development Group Singapore Institute of Manufacturing Technology SINGAPORE Chapter 6. Evaluation and Selection in Product Design for Mass Customization Lee H. S. Luong School of Advanced Manufacturing and Mechanical Engineering University of South Australia
xxx
Contributors
M awson Lakes AUSTRALIA Chapter 8. Assembly Sequence Optimization Using Genetic A lgorithms
Romeo Marin Marian CSIRO Manufacturing & Infrastructure Technology Woodville N orth, SA AUSTRALIA Chapter 8. Assembly Sequence Optimization Using Genetic Algorithms Stergios Papadimitriou Department of Information M anagement Technological Education Institu te of Kavala Kavala GREEC E Chapter 9. Kernel-Based Self- Organiz ed Maps Trained with Supervised Biasjor Gene Expression Data Mining Johan H . C. Reiber Division of Image Processing Department of R adiology Leiden Uni versity M edical Center Leiden THE NETHERLANDS Chapter 4. Fue e v-Rule Extraction Using Radial Basis Function Neural N etworks in High-Dimell5iotlal Data Kwang-Kyu Seo Division of Computer, Information and Telecommunication Engineering Sangmyung University C hungnam KOREA Chapter 2. Neural Network Systems Technology and Applications in Product Life-Cycle Cost Estimates Joaquin Sitte Faculty of Information Techn ology Queensland University of Techn ology Brisbane AUSTRALIA Chapter 3. N eural Network Systems Technologv ill the Atlalysis of Financial Time Series
Renate Sitte Faculty of Engineering and Information and Technology Griffith University Queensland AUSTRALIA Chapter 3. Neural Network Systems Technology in the Analysis if Financial Time Series Ram D. Sriram Design and Process Group Manufacturing Systems Integration Divison National Institute of Standards and Technology Gaithersburg, Maryland USA Chapter 6. Evaluation and Selection in Product Design for Mass Customization FuJ. Wang Design and Process Group Manufacturing Systems Integration Division National Institute of Standards and Technology Gaithersburg, Maryland USA Chapter 6. Evaluation and Selection in Product Design for Mass Customization Juite Wang Department of Industrial Engineering Feng Chia University Taichung, Taiwan REPUBLIC OF CHINA Chapter 5. Fuzzy Decision Modeling of Product Development Processes Chih-Hung Wu Department of Information Management Shu- Te University Yen Chau TAIWAN Chapter 10. Computational Intelligence for Facility Location Allocation Problems Yong Yue Faculty of Creative Arts and Technologies University of Luton Luton UNITED KINGDOM Chapter 1. Neural Network Systems Technology andApplications in CAD/CAM Integration
xxxii
Con tri bu tor s
Xuan F. Zha De sign and Process Group Manufacturing System s Int egration D ivison N ation al Institu te of Standards and Technology Gaithersburg , Maryland
USA Chapter 6. Evaluation and Selection ill Product Designfor Mass Customization
INTELLIGENT KNOWLEDGE-BASED SYSTEMS
BUSINESS AND TECHNOLOGY IN THE NEW MILLENNIUM
VOLUME 5 NEURAL NETWORKS, FUZZY THEORY AND GENETIC ALGORITHMS
Edited by CORNELIUS T. LEONDES
University of California, Los Angeles, USA
KLUWER ACADEMIC PUBLISHERS
BOSTON / DORDRECHT / LONDON
Printed on acid-free paper. Printed in the United States of America.
CONTENTS

Foreword vii
Preface ix
List of contributors xiii

Volume 5. Neural Networks, Fuzzy Theory and Genetic Algorithms

1. Neural Network Systems Technology and Applications in CAD/CAM Integration 3
YONG YUE, LIAN DING AND KEMAL AHMET

2. Neural Network Systems Technology and Applications in Product Life-Cycle Cost Estimates 38
KWANG-KYU SEO

3. Neural Network Systems Technology in the Analysis of Financial Time Series 59
RENATE SITTE AND JOAQUIN SITTE

4. Fuzzy Rule Extraction Using Radial Basis Function Neural Networks in High-Dimensional Data 111
F. ADMIRAAL-BEHLOUL AND J. H. C. REIBER

5. Fuzzy Decision Modeling of Product Development Processes 151
JUITE WANG AND ANDREW KUSIAK

6. Evaluation and Selection in Product Design for Mass Customization 183
XUAN F. ZHA, RAM D. SRIRAM, WEN F. LU, AND FU J. WANG

7. Genetic Algorithm Techniques and Applications in Management Systems 213
CARL K. CHANG AND YUJIA GE

8. Assembly Sequence Optimization Using Genetic Algorithms 234
LEE H. S. LUONG, ROMEO MARIN MARIAN AND KAZEM ABHARY

9. Kernel-Based Self-Organized Maps Trained with Supervised Bias for Gene Expression Data Mining 272
STERGIOS PAPADIMITRIOU

10. Computational Intelligence for Facility Location Allocation Problems 289
SHING-HWANG DOONG, CHIH-CHIN LAI AND CHIH-HUNG WU

Index 321
FOREWORD
Almost unknown to the academic world, and to the general public, the application of intelligent knowledge-based systems is rapidly and effectively changing the future of the human species. Today, human well-being is, as it has been for all of history, fundamentally limited by the size of the world economic product. Thus, if human economic well-being (which I personally define as the bottom centile annual per capita income) is ever soon to reach an acceptable level (e.g., the equivalent of $20,000 per capita per annum in 2004), then intelligent knowledge-based systems must be employed in vast quantities. This is primarily because of the reality that few humans live in efficient societies (such as the United States, Canada, Japan, the UK, France, and Germany, for example) and that inefficient societies, many of which are already large, and growing larger, may require many decades to become efficient. In the meantime, billions of people will continue to suffer economic impoverishment, an impoverishment that inefficient human labor cannot remedy. To create the extra economic output so urgently needed, we have only one choice: to employ intelligent knowledge-based systems in great numbers, which will produce economic output prodigiously, but will consume hardly at all. This multi-volume major reference work, architected by its editor, Cornelius T. Leondes, provides a wealth of 'case studies' illustrating the state of the art in intelligent knowledge-based systems. In contrast to ordinary academic pedagogy, where 'ivory tower' abstraction and elegance are the guiding principles, practical applications require detailed relevant examples that can be used by practitioners to successfully innovate new operational capabilities. The economic progress of the species depends upon the
flow of these innovations, which requires multi-volume major reference works with carefully selected, well-written, and well-edited 'case studies.' Professor Leondes knows these realities well, and the five volumes in this work resoundingly reflect his success in achieving their requirements.

Volume 1 addresses Knowledge-Based Systems. These eleven chapters consider the basic question of how accumulated data and staff expertise from business operations can be abstracted into valuable knowledge, and how such knowledge can then be applied to ongoing operations. Wide and representative situations are considered, ranging from product innovation and design, to intelligent database exploitation, to business model analysis.

Volume 2, Information Technology, addresses in ten chapters the important question of how data should be stored and used to maximize its overall value. Case studies consider a wide variety of application arenas: product development, manufacturing, product management, and even product pricing.

Volume 3 addresses Expert and Agent Systems in ten chapters. Application arenas considered include image databases, business process monitoring, e-commerce, and production planning and scheduling. Again, the coverage is designed to provide a wide range of perspectives and business-function concentrations to help stimulate innovation by the reader.

Volume 4, Intelligent Systems, provides nine chapters considering such topics as mission-critical functions, business forecasting, medical patient care, and product design and development.

Volume 5 addresses Neural Networks, Fuzzy Theory, and Genetic Algorithm Techniques. Its ten chapters cover examples in areas including bioinformatics, product life-cycle cost estimating, product development, computer-aided design, product assembly, and facility location.

The examples assembled by Professor Leondes in this work provide a wealth of practical ideas designed to trigger the development of innovation. The contributors to this grand project are to be congratulated for the major efforts they have expended in creating their chapters. Humans everywhere will soon benefit from the case studies provided herein. Intelligent Knowledge-Based Systems: Business and Technology in the New Millennium is a reference work that belongs on the desk of every innovative technologist. It has taken many decades of experience and unflagging hard work for Professor Leondes to accumulate the wisdom and judgment reflected in his editorial stewardship of this reference work. Wisdom and judgment are rare but indispensable commodities that cannot be obtained in any other way. The world of innovative technology, and the world at large, stand in his debt.

Robert Hecht-Nielsen
Computational Neurobiology
Institute for Neural Computation
Department of Electrical and Computer Engineering
University of California, San Diego
PREFACE
At the start of the 20th century, national economies on the international scene were, to a large extent, agriculturally based. This was, perhaps, the dominant reason for the protraction, on the international scene, of the Great Depression, which began with the Wall Street stock market crash of October 1929. After World War II the trend away from agriculturally based economies and toward industrially based economies continued and strengthened. Indeed, today, in the United States, only approximately 1% of the population is involved in meeting the agriculture requirements of the US and, in addition, provides significant agriculture exports. This, of course, is made possible by the greatly improved techniques and technologies utilized in the agriculture industry.

The trend toward industrially based economies after World War II was, in turn, followed by a trend toward service-based economies. In the United States today, roughly over 70% of employment is in service industries, and this percentage continues to increase.

Separately, the electronic computer industry began to take hold in the early 1960s, and thereafter always seemed to exceed expectations. For example, the first large-scale sales of an electronic computer were of the IBM 650. At that time, projections were that the total sales for the United States would be twenty-five IBM 650 computers. Before the first one came off the production line, IBM had initial orders for over 30,000. That was thought to be huge by the standards of that day, and today it is a very minuscule number, to say nothing of the fact that its computing power was also very minuscule by today's standards. Computer mainframes continued to grow in power and complexity. At the same time, Gordon Moore, of "Moore's Law" fame, and his colleagues founded INTEL. Then around 1980 MICROSOFT was
founded, but it was not until the early 1990s, not that long ago, that WINDOWS was created (incidentally, after the APPLE computer family started). The first browser was the NETSCAPE browser, which appeared in 1995, also not that long ago. Of course, computer networking equipment, most notably CISCO's, also appeared about that time. Toward the end of the last century the "DOT COM bubble" occurred, and it "burst" around 2000.

Coming to the new millennium: for most of our history the wealth of a nation was limited by the size and stamina of the work force. Today, national wealth is measured in intellectual capital. Nations possessing skillful people in such diverse areas as science, medicine, business, and engineering produce innovations that drive the nation to a higher quality of life. To better utilize these valuable resources, intelligent, knowledge-based systems technology has evolved at a rapid and significantly expanding rate, and can be utilized by nations to improve their medical care, advance their engineering technology, and increase their manufacturing productivity, as well as play a significant role in a very wide variety of other areas of activity of substantive significance. The breadth of the major application areas of intelligent, knowledge-based systems technology is very impressive. These include, among others: Agriculture, Business, Chemistry, Communications, Computer Systems, Education, Management, Law, Manufacturing, Mathematics, Medicine, Meteorology, Electronics, Engineering, Environment, Geology, Image Processing, Information, Military, Mining, Power Systems, Science, Space Technology, and Transportation.
It is difficult now to imagine an area that will not be touched by intelligent, knowledge-based systems technology. The great breadth and expanding significance of such a broad field on the international scene requires a multi-volume, major reference work to provide an adequately substantive treatment of the subject, "Intelligent Knowledge-Based Systems: Business and Technology in the New Millennium." This work consists of the following distinctly titled and well-integrated volumes.

Volume I. Knowledge-Based Systems
Volume II. Information Technology
Volume III. Expert and Agent Systems
Volume IV. Intelligent Systems
Volume V. Neural Networks, Fuzzy Theory and Genetic Algorithms
This five-volume set on intelligent knowledge-based systems clearly manifests the great significance of these key technologies for the new economies of the new millennium. The authors are all to be highly commended for their splendid contributions, which together will provide a significant and uniquely comprehensive reference source for research workers, practitioners, computer scientists, students, and others on the international scene for years to come.

Cornelius T. Leondes
University of California, Los Angeles
January 5, 2004
CONTRIBUTORS
VOLUME 1: KNOWLEDGE-BASED SYSTEMS N. Bassiliades Department of Informatics Aristotle University of Thessaloniki Thessaloniki GREECE Chapter 6. Aggregator: A Knowledge-Based Comparison Chart Builder for eShopping Peter Bernus Griffith University School of CIT Nathan Queensland AUSTRALIA Chapter 10. Business Process Modeling and Its Applications in the Business Environment Mariano Corso Department of Management Engineering Polytechnic University of Milan Milano ITALY Chapter 2. Knowledge Management Systems in Continuous Product Innovation
Eugenio di Sciascio Dipartimento di Elettrotecnica ed Elettronica Politecnico di Bari Bari ITALY Chapter 11. Knowledge-Based Systems Technology and Applications in Image Retrieval Francesco M. Donini Universita della Tuscia Viterbo ITALY Chapter 11. Knowledge-Based Systems Technology and Applications in Image Retrieval Janis Grundspenkis Faculty of Computer Science and Information Technology Riga Technical University Riga LATVIA Chapter 7. Impact of the Intelligent Agent Paradigm on Knowledge Management P. Humphreys Faculty of Business and Management University of Ulster Northern Ireland UNITED KINGDOM Chapter 4. Knowledge-Based Systems Technology in the Make-or-Buy Decision in Manufacturing Strategy Brane Kalpic ETI Elektroelement Jt. St. Comp. Izlake SLOVENIA Chapter 10. Business Process Modeling and Its Applications in the Business Environment Marite Kirikova Faculty of Computer Science and Information Technology Riga Technical University Riga LATVIA Chapter 7. Impact of the Intelligent Agent Paradigm on Knowledge Management F. Kokkoras Department of Informatics Aristotle University of Thessaloniki Thessaloniki GREECE Chapter 6. Aggregator: A Knowledge-Based Comparison Chart Builder for eShopping
Shian-Hua Lin Department of Computer Science and Information Engineering National Chi Nan University Taiwan REPUBLIC OF CHINA Chapter 5. Intelligent Internet Information Systems in Knowledge Acquisition: Techniques and Applications Antonella Martini Faculty of Engineering University of Pisa Pisa ITALY Chapter 2. Knowledge Management Systems in Continuous Product Innovation R. McIvor Faculty of Business and Management University of Ulster UNITED KINGDOM Chapter 4. Knowledge-Based Systems Technology in the Make-or-Buy Decision in Manufacturing Strategy Istvan Mezgar CIM Research Laboratory Computer and Automation Research Institute Hungarian Academy of Sciences Budapest HUNGARY Chapter 9. Security Technologies to Guarantee Safe Business Processes in Smart Organizations
Marina Mongiello Dipartimento di Elettrotecnica ed Elettronica Politecnico di Bari Bari ITALY Chapter 11. Knowledge-Based Systems Technology and Applications in Image Retrieval Ralf Muhlberger University of Queensland Information Technology & Electrical Engineering
Queensland AUSTRALIA Chapter 10. Business Process Modeling and Its Applications in the Business Environment
Cezary Orlowski Gdansk University of Technology Gdansk POLAND Chapter 8. Methods of Building Knowledge-Based Systems Applied in Software Project Management Emilio Paolucci Department of Operation and Business Management Polytechnic University of Turin Torino ITALY Chapter 2. Knowledge Management Systems in Continuous Product Innovation Luisa Pellegrini Faculty of Engineering University of Pisa Pisa ITALY Chapter 2. Knowledge Management Systems in Continuous Product Innovation Ram D. Sriram Design and Process Group Manufacturing Systems Integration Division National Institute of Standards and Technology Gaithersburg, Maryland USA Chapter 1. Platform-Based Product Design and Development: Knowledge Support Strategy and Implementation Nikos C. Tsourveloudis Department of Production Engineering and Management Technical University of Crete Chania, Crete GREECE Chapter 3. Knowledge-Based Measurement of Enterprise Agility I. Vlahavas Department of Informatics Aristotle University of Thessaloniki
Thessaloniki GREECE Chapter 6. Aggregator: A Knowledge-Based Comparison Chart Builder for eShopping Xuan F. Zha Design and Process Group Manufacturing Systems Integration Division National Institute of Standards and Technology Gaithersburg, Maryland USA Chapter 1. Platform-Based Product Design and Development: Knowledge Support Strategy and Implementation
VOLUME 2: INFORMATION TECHNOLOGY Ales Brezovar Faculty of Mechanical Engineering University of Ljubljana Ljubljana SLOVENIA Chapter 4. Techniques and Analysis of Sequential and Concurrent Product Development Processes Chris R. Chatwin School of Engineering and Information Technology University of Sussex Brighton UNITED KINGDOM Chapter 3. Modeling Techniques in Integrated Operations and Information Systems in Manufacturing Systems Ke-Zhang Chen Department of Mechanical Engineering The University of Hong Kong HONG KONG Chapter 5. Design and Modeling Methods for Components Made of Multi-Heterogeneous Materials in High-Tech Applications Adrian E. Coronado Management School The University of Liverpool Liverpool UNITED KINGDOM Chapter 2. Information Systems Frameworks and Their Applications in Manufacturing Systems
Xin-An Feng School of Mechanical Engineering Dalian University of Technology Dalian CHINA Chapter 5. Design and Modeling Methods for Components Made of Multi-Heterogeneous Materials in High-Tech Applications Janez Grum Faculty of Mechanical Engineering University of Ljubljana Ljubljana SLOVENIA Chapter 4. Techniques and Analysis of Sequential and Concurrent Product Development Processes George Hadjinicola Department of Public and Business Administration School of Economics and Management University of Cyprus Nicosia CYPRUS Chapter 9. Product Design and Pricing in Response to Competitor Entry: A Marketing-Production Perspective Jared Jackson IBM Almaden Research Center San Jose, California USA Chapter 7. Web Data Extraction Techniques and Applications Using the Extensible Markup Language (XML) D. F. Kehoe Management School The University of Liverpool Liverpool UNITED KINGDOM Chapter 2. Information Systems Frameworks and Their Applications in Manufacturing Systems
Andreas Koeller Department of Computer Science Montclair State University Upper Montclair, New Jersey USA Chapter 6. Quality and Cost of Data Warehouse Views
K. Ravi Kumar Department of Information and Operations Management Marshall School of Business University of Southern California Los Angeles, California USA Chapter 9. Product Redesign and Pricing in Response to Competitor Entry: A Marketing-Production Perspective Janez Kusar Faculty of Mechanical Engineering University of Ljubljana Ljubljana SLOVENIA Chapter 4. Techniques and Analyses of Sequential and Concurrent Product Development Processes Henry C. W. Lau Department of Industrial and Systems Engineering The Hong Kong Polytechnic University Hunghom HONG KONG Chapter 10. Knowledge Discovery by Means of Intelligent Information Infrastructure Methods and Their Applications Amy Lee The Ohio State University Columbus, Ohio USA Chapter 6. Quality and Cost of Data Warehouse Views Choon Seong Leem School of Computer and Industrial Engineering Yonsei University Seoul KOREA Chapter 1. Techniques in Integrated Development and Implementation of Enterprise Information Systems A. C. Lyons Management School The University of Liverpool Liverpool UNITED KINGDOM Chapter 2. Information Systems Frameworks and Their Applications in Manufacturing Systems
Jussi Myllymaki IBM Almaden Research Center San Jose, California USA Chapter 7. Web Data Extraction Techniques and Applications Using the Extensible Markup Language (XML) Anisoara Nica Sybase Incorporated Waterloo, Ontario CANADA Chapter 6. Quality and Cost of Data Warehouse Views
Jorg Niemann IFF University of Stuttgart Fraunhofer IPA Stuttgart GERMANY Chapter 8. Product Life Cycle Management in the Digital Age
Andrew Ning Department of Industrial and Systems Engineering The Hong Kong Polytechnic University Hunghom HONG KONG Chapter 10. Knowledge Discovery by Means of Intelligent Information Infrastructure Methods and Their Applications Elke A. Rundensteiner Department of Computer Science Worcester Polytechnic Institute Worcester Massachusetts USA Chapter 6. Quality and Cost of Data Warehouse Views Marko Starbek Faculty of Mechanical Engineering University of Ljubljana Ljubljana SLOVENIA Chapter 4. Techniques and Analyses of Sequential and Concurrent Product Development Processes Jong Wook Suh School of Computer and Industrial Engineering Yonsei University
Seoul KOREA Chapter 1. Techniques in Integrated Development and Implementation of Enterprise Information Systems
Qian Wang School of Engineering and Information Technology University of Sussex Brighton and Department of Mechanical Engineering University of Bath Bath UNITED KINGDOM Chapter 3. Modeling Techniques in Integrated Operations and Information Systems in Manufacturing Systems Engelbert Westkamper IFF University of Stuttgart Fraunhofer IPA Stuttgart GERMANY Chapter 8. Product Life Cycle Management in the Digital Age Christina W. Y. Wong Department of Industrial and Systems Engineering The Hong Kong Polytechnic University Hunghom HONG KONG Chapter 10. Knowledge Discovery by Means of Intelligent Information Infrastructure Methods and Their Applications R. C. D. Young School of Engineering and Information Technology University of Sussex Brighton UNITED KINGDOM Chapter 3. Modeling Techniques in Integrated Operations and Information Systems in Manufacturing Systems VOLUME 3: EXPERT AND AGENT SYSTEMS Dimitris Askounis Institute of Communications & Computer Systems National Technical University of Athens
Athens GREECE Chapter 2. Expert Systems Technology in Production Planning and Scheduling
G. A. Britton Design Research Center School of Mechanical and Production Engineering Nanyang Technological University SINGAPORE Chapter 1. Techniques in Knowledge-Based Expert Systems for the Design of Engineering Systems
Jing Dai School of Computing National University of Singapore SINGAPORE Chapter 9. Finding Patterns in Image Databases Robert Gay Institute of Communication and Information Systems School of Electrical and Electronic Engineering Nanyang Technological University SINGAPORE Chapter 6. Agent-Based eLearning Systems: A Goal-Based Approach Angela Goh School of Computer Engineering Nanyang Technological University SINGAPORE Chapter 4. The Knowledge Base of a B2B eCommerce Multi-Agent System Ivan Romero Hernandez Technological University of Grenoble LCIS Research Laboratory Valence FRANCE Chapter 5. From Roles to Agents: Considerations on Formal Agent Modeling and Implementation Tu Bao Ho Japan Advanced Institute of Science and Technology Ishikawa JAPAN Chapter 7. Combining Temporal Abstraction and Data-Mining Methods in Medical Data Analysis
Wynne Hsu School of Computing National University of Singapore SINGAPORE Chapter 9. Finding Patterns in Image Databases Chun-Che Huang Department of Information Management National Chi Nan University Taiwan REPUBLIC OF CHINA Chapter 3. Applying Intelligent Agent-Based Support Systems in Agile Business Processes K. Karibasappa Department of Electronics and Telecommunication Engineering University College of Engineering, Burla Sambalpur, Orissa INDIA Chapter 10. Cognition Techniques and Their Applications Nelly Kasim Singapore-MIT Alliance National University of Singapore SINGAPORE Chapter 4. The Knowledge Base of a B2B eCommerce Multi-Agent System Saori Kawasaki Japan Advanced Institute of Science and Technology Ishikawa JAPAN Chapter 7. Combining Temporal Abstraction and Data-Mining Methods in Medical Data Analysis Jean-Luc Koning Technological University of Grenoble LCIS Research Laboratory Valence FRANCE Chapter 5. From Roles to Agents: Considerations on Formal Agent Modeling and Implementation Si Quang Le Japan Advanced Institute of Science and Technology Ishikawa
JAPAN Chapter 7. Combining Temporal Abstraction and Data-Mining Methods in Medical Data Analysis Mong Li Lee School of Computing National University of Singapore SINGAPORE Chapter 9. Finding Patterns in Image Databases Antonio Liotta Center for Communication Systems Research University of Surrey Guildford, Surrey UNITED KINGDOM Chapter 8. Distributed Monitoring: Methods, Means, and Technologies Kostas Metaxiotis Institute of Communications & Computer Systems National Technical University of Athens Athens GREECE Chapter 2. Expert Systems Technology in Production Planning and Scheduling Chunyan Miao School of Computer Engineering Nanyang Technological University SINGAPORE Chapter 4. The Knowledge Base of a B2B eCommerce Multi-Agent System Yuan Miao Institute of Communication and Information Systems Nanyang Technological University SINGAPORE Chapter 6. Agent-Based eLearning Systems: A Goal-Based Approach Trong Dung Nguyen Japan Advanced Institute of Science and Technology Ishikawa JAPAN Chapter 7. Combining Temporal Abstraction and Data-Mining Methods in Medical Data Analysis
Srikanta Patnaik Department of Electronics and Telecommunication Engineering University College of Engineering, Burla
Sambalpur, Orissa INDIA Chapter 10. Cognition Techniques and Their Applications
John Psarras Institute of Communications & Computer Systems National Technical University of Athens Athens GREECE Chapter 2. Expert Systems Technology in Production Planning and Scheduling Zhiqi Shen Institute of Communication and Information Systems School of Electrical and Electronic Engineering Nanyang Technological University SINGAPORE Chapter 6. Agent-Based eLearning Systems: A Goal-Based Approach S. B. Tor Singapore-MIT Alliance Nanyang Technological University SINGAPORE Chapter 1. Techniques in Knowledge-Based Expert Systems for the Design of Engineering Systems
W. Y. Zhang Design Research Center School of Mechanical and Production Engineering Nanyang Technological University SINGAPORE Chapter 1. Techniques in Knowledge-Based Expert Systems for the Design of Engineering Systems VOLUME 4: INTELLIGENT SYSTEMS Cheng-Leong Ang Singapore Institute of Manufacturing Technology SINGAPORE Chapter 4. An Intelligent Hybrid System for Business Forecasting Sistine A. Barretto Advanced Computing Research Centre The University of South Australia Adelaide
AUSTRALIA Chapter 6. Techniques in the Utilization of the Internet and Intranets in Facilitating the Development of Clinical Decision Support Systems in the Process of Patient Care
Billy Fenton International Test Technologies and University of Ulster Letterkenny, Donegal IRELAND Chapter 5. Intelligent Systems Technology in the Fault Diagnosis of Electronic Systems Robert Gay Institute of Communication and Information Systems School of Electrical and Electronic Engineering Nanyang Technological University SINGAPORE Chapter 4. An Intelligent Hybrid System for Business Forecasting Victor Giurgiutiu Mechanical Engineering Department University of South Carolina Columbia, South Carolina USA Chapter 8. Mechatronics and Smart Structures Design Techniques for Intelligent Products, Processes and Systems Marc-Philippe Huget Leibniz Laboratory Grenoble FRANCE Chapter 9. Engineering Interaction Protocols for Multiagent Systems Richard W. Jones School of Engineering University of Northumbria Newcastle upon Tyne England UNITED KINGDOM Chapter 2. Intelligent Patient Monitoring in the Intensive Care Unit and the Operating Room Jean-Luc Koning Technological University of Grenoble LCIS Research Laboratory
Valence FRANCE Chapter 9. Engineering Interaction Protocols for Multiagent Systems
Xiang Li Singapore Institute of Manufacturing Technology SINGAPORE Chapter 4. An Intelligent Hybrid System for Business Forecasting Liam Maguire Department of Informatics University of Ulster Derry NORTHERN IRELAND Chapter 5. Intelligent Systems Technology in the Fault Diagnosis of Electronic Systems
T. M. McGinnity Department of Informatics University of Ulster Derry NORTHERN IRELAND Chapter 5. Intelligent Systems Technology in the Fault Diagnosis of Electronic Systems Tolety Siva Perraju Verizon Communications Waltham, Massachusetts USA Chapter 3. Mission Critical Intelligent Systems Mauricio Sanchez-Silva Department of Civil and Environmental Engineering Universidad de los Andes Bogota COLOMBIA Chapter 7. Risk Analysis and the Decision-Making Process in Engineering Garimella Uma South Asia International Institute Hyderabad INDIA Chapter 3. Mission Critical Intelligent Systems James R. Warren Advanced Computing Research Centre The University of South Australia
Mawson Lakes AUSTRALIA Chapter 6. Techniques in the Utilization of the Internet and Intranets in Facilitating the Development of Clinical Decision Support Systems in the Process of Patient Care
Xuan F. Zha Design and Process Group Manufacturing Systems Integration Division National Institute of Standards and Technology Gaithersburg, Maryland USA Chapter 1. Artificial Intelligence and Integrated Intelligent Systems: Applications in Product Design and Development VOLUME 5: NEURAL NETWORKS, FUZZY THEORY AND GENETIC ALGORITHM TECHNIQUES Kazem Abhary School of Advanced Manufacturing and Mechanical Engineering University of South Australia Mawson Lakes AUSTRALIA Chapter 8. Assembly Sequence Optimization Using Genetic Algorithms F. Admiraal-Behloul Division of Image Processing Leiden University Medical Center Leiden THE NETHERLANDS Chapter 4. Fuzzy Rule Extraction Using Radial Basis Function Neural Networks in High-Dimensional Data
Kemal Ahmet Faculty of Creative Arts and Technologies University of Luton Luton UNITED KINGDOM Chapter 1. Neural Network Systems Technology and Applications in CAD/CAM Integration Carl K. Chang Department of Computer Science Iowa State University Ames, Iowa USA Chapter 7. Genetic Algorithm Techniques and Applications in Management Systems
Lian Ding Faculty of Creative Arts and Technologies University of Luton Luton UNITED KINGDOM Chapter 1. Neural Network Systems Technology and Applications in CAD/CAM Integration Shing-Hwang Doong Department of Information Management Shu-Te University Yen Chau TAIWAN Chapter 10. Computational Intelligence for Facility Location Allocation Problems Yujia Ge Department of Computer Science Iowa State University Ames, Iowa USA Chapter 7. Genetic Algorithm Techniques and Applications in Management Systems Andrew Kusiak Department of Mechanical and Industrial Engineering University of Iowa Iowa City, Iowa USA Chapter 5. Fuzzy Decision Modeling of Product Development Processes Chih-Chin Lai Department of Information Management Shu-Te University Yen Chau TAIWAN Chapter 10. Computational Intelligence for Facility Location Allocation Problems Wen F. Lu Product Design and Development Group Singapore Institute of Manufacturing Technology SINGAPORE Chapter 6. Evaluation and Selection in Product Design for Mass Customization Lee H. S. Luong School of Advanced Manufacturing and Mechanical Engineering University of South Australia
Mawson Lakes AUSTRALIA Chapter 8. Assembly Sequence Optimization Using Genetic Algorithms
Romeo Marin Marian CSIRO Manufacturing & Infrastructure Technology Woodville North, SA AUSTRALIA Chapter 8. Assembly Sequence Optimization Using Genetic Algorithms Stergios Papadimitriou Department of Information Management Technological Education Institute of Kavala Kavala GREECE Chapter 9. Kernel-Based Self-Organized Maps Trained with Supervised Bias for Gene Expression Data Mining Johan H. C. Reiber Division of Image Processing Department of Radiology Leiden University Medical Center Leiden THE NETHERLANDS Chapter 4. Fuzzy Rule Extraction Using Radial Basis Function Neural Networks in High-Dimensional Data Kwang-Kyu Seo Division of Computer, Information and Telecommunication Engineering Sangmyung University Chungnam KOREA Chapter 2. Neural Network Systems Technology and Applications in Product Life-Cycle Cost Estimates Joaquin Sitte Faculty of Information Technology Queensland University of Technology Brisbane AUSTRALIA Chapter 3. Neural Network Systems Technology in the Analysis of Financial Time Series
Renate Sitte Faculty of Engineering and Information Technology Griffith University Queensland AUSTRALIA Chapter 3. Neural Network Systems Technology in the Analysis of Financial Time Series
Ram D. Sriram Design and Process Group Manufacturing Systems Integration Division National Institute of Standards and Technology Gaithersburg, Maryland USA Chapter 6. Evaluation and Selection in Product Design for Mass Customization Fu J. Wang Design and Process Group Manufacturing Systems Integration Division National Institute of Standards and Technology Gaithersburg, Maryland USA Chapter 6. Evaluation and Selection in Product Design for Mass Customization Juite Wang Department of Industrial Engineering Feng Chia University Taichung, Taiwan REPUBLIC OF CHINA Chapter 5. Fuzzy Decision Modeling of Product Development Processes Chih-Hung Wu Department of Information Management Shu-Te University Yen Chau TAIWAN Chapter 10. Computational Intelligence for Facility Location Allocation Problems Yong Yue Faculty of Creative Arts and Technologies University of Luton Luton UNITED KINGDOM Chapter 1. Neural Network Systems Technology and Applications in CAD/CAM Integration
Xuan F. Zha Design and Process Group Manufacturing Systems Integration Division National Institute of Standards and Technology Gaithersburg, Maryland USA Chapter 6. Evaluation and Selection in Product Design for Mass Customization
VOLUME I. KNOWLEDGE-BASED SYSTEMS
PLATFORM-BASED PRODUCT DESIGN AND DEVELOPMENT: KNOWLEDGE SUPPORT STRATEGY AND IMPLEMENTATION
XUAN F. ZHA AND RAM D. SRIRAM
1. INTRODUCTION
A product family is a group of related products that share common features, components, and subsystems, and satisfy a variety of market niches. A product platform is the set of parts, subsystems, interfaces, and manufacturing processes that are shared among a set of products (Meyer and Lehnerd 1997). A product family is thus built around a set of features and components that remain constant, through the product platform, from product to product. Platform-based product family design has been recognized as an efficient and effective means to realize sufficient product variety to satisfy a range of customer demands in support of mass customization (Tseng and Jiao 1998). The platform product development approach usually includes two main phases: 1) the establishment of the appropriate product platform; and 2) the customization of the platform into individual product variants to meet specific market, business, and engineering needs. The establishment, maintenance, and application of the right product platform are very complex tasks. Contemporary design processes have become increasingly knowledge-intensive and collaborative (Tong and Sriram 1991a,b; Sriram 2002). Knowledge-intensive support has become more critical in the design process and has been recognized as a key to future competitive advantage in product development. To improve the design of product families for mass customization, it is imperative to provide knowledge support and to share design knowledge among distributed designers. Several quantitative frameworks have been proposed for both phases of platform product development.
They provide valuable managerial guidelines for implementing the platform product development approach. However, there are very few systematic qualitative or intelligent methodologies that support product development team members in adopting this platform product development practice, despite the progress made in several research projects (Zha and Lu 2002a,b). The aim of this chapter is to discuss knowledge support methodologies and technologies for platform-based product family design. An integrated modular product family design process with knowledge support is explored. This process includes customer requirements modeling, product architecture modeling, product platform establishment, product family generation, and product assessment. The driving force behind this work is to develop a formal, technical approach, based on the modular product design paradigm, to efficiently and effectively model and synthesize a family of products (product platform and variants) that can provide the increased product variety necessary for today's market. The organization of this chapter is as follows. Section 2 reviews the background and current research status related to platform-based product development and product family design. Sections 3 and 4 outline a platform-based product development model and a modular design methodology for product family design. Sections 5 and 6 discuss the module-based product family design process and a knowledge support framework for modular product family design, respectively. Section 7 addresses the relevant issues and technologies for implementing the knowledge-intensive support system for modular product family design. Section 8 summarizes the chapter and explores future work.
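As an aside for practitioners, the two-phase structure just described can be pictured with a small sketch. The following is our own minimal illustration, not the implementation developed in this chapter; all class and module names (Module, ProductPlatform, ProductVariant, "chassis", and so on) are hypothetical.

from dataclasses import dataclass

# Minimal sketch of the two-phase platform approach (hypothetical names):
# phase 1 establishes a platform of shared modules; phase 2 customizes it
# into individual product variants for different market niches.

@dataclass
class Module:
    name: str

@dataclass
class ProductPlatform:
    """The parts and subsystems shared by every member of the family."""
    common_modules: list

@dataclass
class ProductVariant:
    """A platform instance extended with variant-specific modules."""
    name: str
    platform: ProductPlatform
    custom_modules: list

    def bill_of_modules(self) -> list:
        # Every variant reuses the platform's modules and adds its own.
        return self.platform.common_modules + self.custom_modules

# Phase 1: establish the platform once.
platform = ProductPlatform([Module("chassis"), Module("power_unit")])

# Phase 2: customize the platform into individual variants.
basic = ProductVariant("basic", platform, [Module("manual_control")])
deluxe = ProductVariant("deluxe", platform,
                        [Module("smart_control"), Module("display")])

for variant in (basic, deluxe):
    print(variant.name, [m.name for m in variant.bill_of_modules()])

The point of the design is that variety is confined to the variant-specific module list, while the platform's common modules are defined once and reused by every family member.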
2. LITERATURE REVIEW

In this section, we briefly review the background and current research status related to platform-based product development and product family design. Various approaches and strategies for designing families of products and mass-customized goods are reported in the literature. These techniques appear in disciplines as varied as operations research (Gaither 1980), computer science (Nutt 1992), marketing and management science (Kotler 1989; Meyer et al. 1993; Pine II 1993), and engineering design (Fujita et al. 1997; Simpson et al. 1998, 2001; Ulrich et al. 1995). Two key concepts underlie existing schemes for product family modeling: product family architecture and product family evolution. Three kinds of approaches are widely used for representing architecture and modularity in a product family: 1) product-modeling languages (Erens et al. 1997), 2) graph representations (Ishii et al. 1995; Agarwal and Cagan 1998), and 3) modules or building blocks (BBs) (Tseng and Jiao 1996; Gero 1990; Fujita and Ishii 1997; Rosen 1996). A product-modeling language allows product families to be represented in three domains: functional, technological, and physical. It provides an effective means for representing product variety, but offers little aid for design synthesis and analysis. In a graph structure, the different types of nodes denote the individual components, subassemblies, and fasteners, and the links denote dependencies between the nodes. However, it lacks
the ability to model product family constraints. Although the grammar approach is combined with the graph representation to improve its representational capability, graph grammars are only able to implicitly capture product architecture information and product family information through production rules (Siddique and Rosen 1999, 2001). A model specifically tailored for representing product family architecture is the building block model, which is derived from the concept of using modules to provide variety. Building blocks are organized in a hierarchical decomposition tree architecture (systems, modules, and attributes) from both functional and technical viewpoints (Kusiak and Huang 1996; Jiao et al. 2000). Under the hierarchical representation scheme, product variety can be implemented at different levels within the product architecture. Module-based product architecture reasoning systems are currently being developed from different viewpoints (Rosen 1996). Much work in strategic management and marketing research seeks to categorize or map the evolution and development of product families (Meyer et al. 1993; Wheelwright et al. 1989, 1992). Sanderson (1991) introduces the notion of a "virtual design" that evolves into product families. Wheelwright and Clark (1992) suggest designing "platform projects," and Rothwell and Gardiner (1990) advocate "robust designs" as a means to generate a series of different products within a single product family. These product family maps are less formal and are intended primarily for strategic management; they are actually product platforms that can be used to generate product variants to form a product family. However, none of these approaches have been formalized for design synthesis. The basic concept of a family of products, or multi-product approach, is to obtain the largest set of products from a standardized set of base components and production processes (McKay et al. 1996). A key aspect in developing product families is to consider the flexibility of assembly and manufacturing processes. Stadzisz and Henrioud (1995) describe a methodology for the integrated design of product families and assembly processes through the use of web grammars (Pfaltz and Rosenfeld 1969). The work clusters products based on geometric similarities to obtain product families so as to decrease product variability within a product family and minimize the required flexibility of the associated assembly system. It is more applicable to later design stages when more quantitative information is available. Tseng and Jiao (1996, 1998) developed a set of approaches entitled "Design for Mass Customization (DFMC)" with an emphasis on how to "set up a rational product family architecture in order to conduct family-based design, rather than design only a single product." The family-based DFMC approach groups similar products into families based on functional requirements, product topology, or manufacturing and assembly similarity. Accordingly, it provides a series of steps to formulate an optimal product family architecture. Their work is also more applicable in the later stages of design, particularly once the system architecture has been established. Gonzalez-Zugasti (2000) proposes a four-step interactive process model for designing a platform-based product family: design requirements and models (e.g. function requirements and
design constraints, etc.), platform design, variants design, and platform evaluation, renegotiation, and iteration. The most important characteristics stressed in the literature for designing product families are modularity (Chen et al. 1994, 1996; Martin and Ishii 1996; Sanderson 1991; Ulrich and Tung 1991), commonality and reusability (Collier 1981, 1982; McDermott et al. 1994), and standardization (Lee and Tang 1997; Ulrich and Eppinger 1995). The concept of functional modularity should be incorporated with the requirements of product families from the product life cycle perspective. Ulrich and Tung (1991) give a summary of different types of modularity. Chen et al. (1996) describe a family of products as a "family of designs" which conforms to a given ranged set of design requirements, and recommend designing product families by changing a small number of components or modules. Ishii and his team (Ishii et al. 1995; Martin and Ishii 1996; Chang and Ward 1995) emphasize computational approaches for product variety design, including the representation, measurement, and evaluation of product varieties. "Design for Variety" refers to product and process designs that achieve the best balance of design modularity, component standardization, and product offering. Uzumeri and Sanderson (1995) emphasize flexibility and standardization as a means for enhancing product flexibility and offering a wide variety of products. McDermott (1994) and Collier (1981) stress commonality across products within a product family as an effective means to provide product variety. Ulrich (1995) and Ulrich and Eppinger (1995) investigate the role of product architecture and its impact on product change, product variety, component standardization, product performance, and product development management. In reviewing prior work, we found that several quantitative frameworks have been proposed for product family design. They provide valuable managerial guidelines for implementing overall platform-based product family development. An overview of related research on platform-based product design and development is summarized in Figure 1. There are generally two approaches to product family design. One is the top-down approach, which adopts platform-based product family design (Simpson 1998, 2001). The other is the bottom-up approach, which implements family-based product design through re-design or modification of the constituent components of a product. The former is currently the dominant research approach. Current research and development work is mainly in the realm of academics and does not provide support for knowledge-based processes. There are very few systematic quantitative or intelligent methodologies that support product development team members in adopting this platform product development practice, despite the progress made in several research projects (Zha and Lu 2002a,b). The most recent work in the area of product family design comes from Fujita et al. (1999, 2001) and Simpson et al. (2001). Much of their work lays a solid foundation for the work proposed in this research. The approach advocated in this work is for companies to realize a family of modular products that can be easily modified, configured, and quickly adapted to satisfy a variety of customer requirements or target specific market niches, with knowledge support.
Figure 1. Overview of related work on platform-based product family design and development. (The figure surveys, among others, Stanford's Center for Design Research (Ishii et al., design for variety), GIT's Systems Realization Laboratory (Mistree, Rosen et al., decision- and optimization-based approaches), MIT's Center for Innovation in Product Development (Otto et al., platform-based approach), and HKUST (Tseng, Jiao et al., product family architecture).)
3. PLATFORM-BASED PRODUCT DESIGN AND DEVELOPMENT
A product family may have its origin in a differentiation process of a base product or in an aggregation process of distinct products. The product family has the most impact on a firm's ability to efficiently deliver large product variety and has profound implications for subsequent product development activities. The product family design process is tightly linked to issues of importance to the entire enterprise: product change, product variety, component standardization, product performance, manufacturability, and product development management. An effective platform for a product family can allow a variety of derivative products to be created more rapidly and easily (cost and time savings), with each product providing the features and functions desired by a particular market segment (Simpson et al. 1998, 2001). An interactive process for designing a platform-based product family was summarized in (Gonzalez-Zugasti 2000). Figure 2 shows an overview of the interactive process applied to cellular phone family design. The steps in the product family design process shown in Figure 2 are described in more detail below:
1. Design requirements and models (e.g. customer requirements, function requirements, and design constraints, etc.). The first step is to construct mathematical models that connect the process models and design choices to the performance indices for products
Figure 2. Platform-based product family design implementation.
in a family. Design process models are descriptions of the sequence of activities that take place in the design process. They are often drawn in the form of flow diagrams, with feedback showing the iterative returns to earlier stages. These would include performance as well as cost models, and would also incorporate revenue and competition models in the case of commercial products.
2. Platform design. With design requirements and models, the design team can create a set of individually designed products as a baseline case against which platform-based variants can be compared. Based on these individually designed products, the representatives from the design team or subsystem experts can explore the commonalities of the design and decide on the common platform. The decision is based on the similarity of the requirements, the flexibility of the subsystems involved, and other concerns such as availability of resources, manufacturability and assemblability, schedule constraints, etc.
3. Variants design. Once a platform is generated, a portion of the design will be handed over to the individual design teams, who can complete and optimize the design of their respective products by adjusting the variant variables.
4. Platform evaluation, re-negotiation, and iteration. The new designs form an alternative product family, which can then be compared to the baseline case of individually designed products or to other platform-based alternatives in terms of technical
performance, cost, risk, etc. If the platform-based family is not acceptable, it may be necessary to renegotiate the platform choices and iterate through the design loop to arrive at an adequate family design.
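The renegotiation loop in step 4 can be sketched as a simple control flow. The following minimal Java sketch is illustrative only: Family, VariantDesigner, Renegotiator, and the scoring rule are invented placeholders, not types from the system described in this chapter.

import java.util.List;

// Minimal sketch of the design-evaluate-renegotiate loop in step 4 above.
public final class PlatformDesignLoop {

    record Family(List<String> modules, double performance, double cost) {
        double score() { return performance - cost; }  // toy utility: higher is better
    }

    interface VariantDesigner { Family design(List<String> platformModules); }
    interface Renegotiator { List<String> revise(List<String> platformModules); }

    /** Iterate until the platform-based family beats the baseline, or give up. */
    static Family designFamily(Family baseline, List<String> platform,
                               VariantDesigner designer, Renegotiator renegotiator,
                               int maxIterations) {
        for (int i = 0; i < maxIterations; i++) {
            Family candidate = designer.design(platform);    // steps 2-3: platform and variants
            if (candidate.score() >= baseline.score()) {
                return candidate;                            // acceptable platform-based family
            }
            platform = renegotiator.revise(platform);        // step 4: renegotiate platform choices
        }
        return baseline;  // fall back to the individually designed products
    }

    public static void main(String[] args) {
        Family baseline = new Family(List.of("custom-A", "custom-B"), 10.0, 6.0);
        // Toy designer: a shared platform lowers cost relative to the baseline.
        VariantDesigner designer = mods -> new Family(mods, 10.0, 6.0 - mods.size());
        Renegotiator renegotiator = mods -> mods;            // no-op revision for the demo
        Family result = designFamily(baseline, List.of("chassis", "psu"),
                                     designer, renegotiator, 5);
        System.out.println("Chosen family score: " + result.score());
    }
}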
4. PRODUCT PLATFORM AND PRODUCT FAMILY MODELING

Within a platform-based design and development strategy, there are different ways to create a product family. Based on the way the product family is created, there are two categories of product platforms: the integral platform and the modular platform. The integral platform is a single, monolithic part of the product that is shared by all the products in the family. Although it seems to be a restrictive type of platform, real examples exist, such as the telecommunications ground network for interplanetary spacecraft described in (Gonzalez-Zugasti 2000). The term 'integral' is used since the single common platform is an integral part of each variant; it cannot be replaced by a different piece or module. The modular platform is a more general case, in which the product is divided into modules that can be swapped for others of different size or functionality to create variants. Modular systems provide the ability to achieve product variety through the combination and standardization of components. Within a modular platform, the platform is the set of modules that is reused across the product family. Companies usually have a set of modules already designed for previous products that could be reused, as well as the resources to design new versions of the same modules or modules with new functionality. In addition, there exists the possibility of purchasing modules from existing catalogs, or even outsourcing the design of new ones. The modular platform-based product family design and development process advocated in this research generates a re-configurable product platform that can be easily modified and upgraded through the addition, substitution, and exclusion of modules to realize a module-based product family. Therefore, the focus of discussion in this section is on modular product family modeling, product platform generation, and product family evaluation. The detailed module-based product family design process will be discussed in the next section.
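As a concrete picture of the modular platform just described, the short Java sketch below models a variant as the shared platform modules plus variant-specific modules; the module and product names are invented for illustration.

import java.util.LinkedHashSet;
import java.util.Set;

// Toy data model for the modular platform: variant = platform + swapped-in modules.
public final class ModularPlatformDemo {

    /** A variant reuses the common platform and adds its own modules. */
    static Set<String> variant(Set<String> platform, Set<String> variantModules) {
        Set<String> product = new LinkedHashSet<>(platform);  // reuse the shared platform
        product.addAll(variantModules);                       // swap in variant modules
        return product;
    }

    public static void main(String[] args) {
        // Modular platform: the set of modules reused across the whole family.
        Set<String> platform = Set.of("chassis", "power-supply", "controller");

        Set<String> economy = variant(platform, Set.of("basic-display"));
        Set<String> premium = variant(platform, Set.of("touch-display", "camera"));

        System.out.println("Economy variant: " + economy);
        System.out.println("Premium variant: " + premium);
    }
}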
4.1. Product family architecture modeling

A product family architecture represents the conceptual structure and logical organization of product families from the viewpoints of both customers and designers. A well-developed product family architecture can provide a generic architecture to capture and utilize commonality, within which each new product instantiates and extends it so as to anchor future designs to a common product line structure. Thus, the modeling and design of product architectures is critical for mass customizing products to meet differentiated market niches and satisfy requirements on local content, component carry-over between generations, recyclability, and other strategic issues. The modeling and representation scheme used in this research combines recent developments in product representation (e.g., Fujita and Ishii (1997), Zha and Du (2001), and Rosen (1996)) into a hybrid approach. The hybrid approach hierarchically decomposes product families into products or systems, modules, and attributes, as shown in Figure 3.
Figure 3. Products, modules, and attributes.
Under this hierarchical representation scheme, product variety is implemented at different levels within the product architecture. Discrete mathematics and matrices are used as a formal foundation for the configuration design of modular product architectures. Based on the hybrid representation, a knowledge-supported product module reasoning system is developed; details will be discussed later.
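A small illustration of this matrix foundation: from a boolean product-module incidence matrix, the common, variant, and unique modules of Figure 3 can be derived by counting how many products use each module. The Java sketch below uses an invented three-product, five-module family.

// Classify modules as common, variant, or unique from an incidence matrix.
public final class IncidenceMatrixDemo {

    public static void main(String[] args) {
        String[] modules = {"M1", "M2", "M3", "M4", "M5"};
        // usage[p][m] == true when product p contains module m
        boolean[][] usage = {
            {true,  true,  false, true,  false},   // product A
            {true,  false, true,  true,  false},   // product B
            {true,  false, false, true,  true }    // product C
        };

        for (int m = 0; m < modules.length; m++) {
            int count = 0;
            for (boolean[] product : usage) {
                if (product[m]) count++;
            }
            String kind = (count == usage.length) ? "common"    // in every product
                        : (count == 1)            ? "unique"    // in exactly one product
                        :                           "variant";  // shared by some products
            System.out.println(modules[m] + ": " + kind);
        }
    }
}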
4.2. Product family evolution representation

Product family maps or catalogs are intended for strategic management and can be used as product platforms to generate product variants to form a product family (Wheelwright and Sasser 1989). In this research, the product family map or catalog is used to trace the evolution of a product family, as shown in Figure 4. The market segmentation grid is used to facilitate identifying platform leveraging strategies in a product family (Meyer et al. 1997). The major market segments serviced by products are listed horizontally in a market segmentation grid, and the vertical axis reflects different tiers of price and performance within each market segment. Similar to the work in (Simpson 1998), the market segmentation grid is applied to identify module-based product platform scaling opportunities from the overall design requirements. As a qualitative approach, the beachhead method is most helpful for this research in identifying and developing a common platform within a product family, as shown in Figure 5.
Figure 4. Product family evolution: product family map and the development of new generations. (The panels trace Generation 1: the original product platform; Generation 2: a new generation of the product family via cost reduction and/or new features; Generation 3: a new product platform via improved or added core technologies; through Generation N: a new product family and new product platform.)
Figure 5. Platform extensions, and scaling up and down for different market niches. (Market segments A, B, and C run horizontally; the vertical axis spans low-cost/low-performance through mid-range to high-cost/high-performance tiers, with scale-up, scale-down, and platform extensions around the common product platform(s).)
Figure 6. Structured GA for product design implementation.
4.3. Product family generation
A product family is generated through configuration design, in which a family of products can widely vary the selection and assembly of modules or pre-defined building blocks at different levels of abstraction so as to satisfy diverse customer requirements (Tseng and Jiao 1996, 1998; Fujita et al. 1998, 1999). The essence of configuration design is to synthesize product structures by determining which modules or building blocks are in the product and how they are configured to satisfy a set of requirements and constraints. There are many approaches to module assembly and configuration design, such as the assembly incidence matrix and genetic algorithms (Chen et al. 1999; Zha and Du 2001; Brown 1998; Leger 1999). In this research, a structured genetic algorithm (sGA) (Dasgupta and McGregor 1994; Sriram 1997) based product representation and evolutionary design scheme is employed for product family generation through module configuration, as shown in Figure 6. The sGA product representation uses regulatory genes that act as switches to turn genes on (active) and off (passive). Each gene in the higher levels acts as a switchable pointer with two possible targets: when the gene is active (on) it points to its lower-level target (gene), and when passive (off) it points to the same-level target. At the evaluation stage, only the expressed genes of an individual are translated into phenotypic functionality, which means that only the genes that are currently active contribute to the product, and hence to the fitness of the product. The passive genes do not influence fitness and are carried along as redundant genetic material during the evolutionary process. The utilization of the sGA approach for product families can therefore be summarized as follows. First, genes represent modules that are either active or passive, depending on whether or not they are part of the product architecture. Then, a family of products relying on the addition or subtraction of modules to meet customer requirements can be evaluated by alternating different "active" and "passive" modules. A product family thus corresponds to product variants that have different active and passive combinations of modules.
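To make the active/passive mechanism concrete, here is a minimal Java sketch in which a bit vector of regulatory genes switches modules on or off, and only the expressed modules enter the phenotype and the fitness evaluation. The module list and the toy fitness rule are invented for illustration and are not the sGA implementation used in this research.

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sketch of the structured-GA idea: regulatory genes gate which modules are expressed.
public final class StructuredGaSketch {

    static final String[] MODULES = {"frame", "motor-A", "motor-B", "sensor", "display"};

    /** Phenotype = the modules whose regulatory gene is active. */
    static List<String> express(boolean[] regulatoryGenes) {
        List<String> phenotype = new ArrayList<>();
        for (int i = 0; i < MODULES.length; i++) {
            if (regulatoryGenes[i]) phenotype.add(MODULES[i]);  // passive genes are carried silently
        }
        return phenotype;
    }

    /** Toy fitness: reward expressed modules, penalize size as a cost proxy. */
    static double fitness(boolean[] genes) {
        int active = 0;
        for (boolean g : genes) if (g) active++;
        return 2.0 * active - 0.5 * active * active;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        boolean[] variant = new boolean[MODULES.length];
        for (int i = 0; i < variant.length; i++) variant[i] = rng.nextBoolean();

        System.out.println("Variant modules: " + express(variant));
        System.out.println("Fitness: " + fitness(variant));

        // Flipping one regulatory gene yields another family member:
        variant[4] = !variant[4];
        System.out.println("Neighbouring variant: " + express(variant));
    }
}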
4.4. Product family evaluation for customization
The customization stage aims at obtaining a feasible product family member architecture by reasoning over the product family module space according to customer requirements (Meyer et al. 1997). Two steps are involved in this stage. First, customer requirements such as function, assembly, and reuse need to be converted to constraints (Suh 1990). Then, reasoning is performed at two levels, namely the module and attribute levels, to determine a feasible product family member architecture. In order to evaluate a family of products for mass customization, suitable metrics are needed to assess the appropriateness of a product platform and the corresponding family of products (Krishnan and Gupta 2001). The metrics should also be useful for measuring the various attributes of the product family and assessing a platform's modularity. With respect to the process of modular platform based product family design and customization, the evaluation of a product family can be viewed from three different level perspectives: product platform, product family, and product variant. The product variant level evaluation is essentially the same as individual product design evaluation. Various traditional design evaluation approaches are applicable, and the metrics for this level include cost, time, assemblability, manufacturability, etc. The platform and family level evaluation is focused on the overall benefit of product family development. The metrics at these levels reflect that the main goal of designing products/families is to maximize the benefits to the company. Thus, they can be used to monitor platform and product family development. This is related to the impact a Research and Development (R&D) project has on platform component revenue and on investments in resources. If the impact is high, the activities have to be reviewed and planned with care. Data from ongoing and estimated business can be used to rank R&D projects according to their future impact on the business process and the total platform revenue. The strategy is defined in relationship to the component categories of a product platform. A product platform by nature represents a set of functions, features, parameters, components, and information around which a product architecture can be developed to base a family of products and technologies on (Simpson 1998). A global product platform is in general the common basis for multiple product variants targeted to meet specialized requirements for specific applications and markets. The offered modules, features, and parameters have to be compliant with the specific market and application needs. Technologies and resources used for R&D, engineering, and manufacturing have to be harmonized as well. Maximum global market coverage with minimum internal variation in products, processes, and tools should be the major business goal. Existing product platforms have to be adapted to global markets and application needs, or merged with other product lines strong in specific markets and features, and/or harmonized with each other. Development activities between product families have to be co-ordinated regarding their contribution to a common platform concept and their impact on market needs. Meyer and Lehnerd (1997) describe measuring the performance of product families in general. Other platform related strategies to minimize product
variety are described in (Krishnan and Gupta 2001; Jiao and Tseng 1998; Sanderson 1991). Metrics and advanced analysis of sales data should make the situation transparent for strategic R&D decisions. R&D projects are ranked, in many cases, only by their development costs and risks, and not by the follow-up costs caused by the development and their influence on the total platform revenue. There is no easy way to communicate metrics based charting methods. The method has to set the R&D activities in relationship to the ability to integrate the results into the business process and the related platform. Technology managers have to identify, analyze, and decide which proposed and ongoing R&D activities bring the most benefit to the overall platform strategy within an organization. A platform strategy encompasses R&D portfolio planning and assessment for ongoing and planned projects based on metrics. In this respect, Meyer et al. (1997) have proposed platform efficiency and platform effectiveness as two methods to measure R&D performance, focused on platforms and their follow-on product variants within a product family. They define platform efficiency as the degree to which a platform allows economical generation of derivative products. At the follow-on product level this means:

Platform efficiency = (engineering cost of the derivative product) / (engineering cost of the platform)
The question this measure seeks to answer is: How much did the follow-on product cost to develop as a fraction of what was allocated to the base platform? In a similar manner, platform effectiveness is defined as the degree to which the products based on a product platform produce revenue for the firm relative to the cost of developing those products. At the follow-on product level this means:

Platform effectiveness = (revenue of the derivative product) / (engineering cost of the derivative product)
Other methods that can be useful for measuring performance from a product family perspective, proposed by Meyer and Lehnerd (1997), are cycle time efficiency (i.e. the elapsed time to develop a derivative product compared with the elapsed time to develop the platform), technological competitive responsiveness (i.e. tracking the degree to which a firm has beaten its competitors to the marketplace with new features or capabilities in its products), and profit potential (i.e. targeting the profitability of derivative products by examining gross margins). These metrics do not explicitly tell management when to create a new platform. However, they provide a rich context to determine when product platforms should be replaced and what to expect from new products based on these new platforms. In this research, the following two metrics have been used in platform-based family level evaluation (Simpson 1998): (a) Market efficiency (η_M) embodies a tradeoff between marketing and engineering design, offering the least amount of variety so as to satisfy the greatest amount
of customers, i.e., targeting the largest number of market niches with the fewest products. (b) Investment efficiency (η_I) embodies a tradeoff between manufacturing and engineering design, investing a minimal amount of capital into machining and tooling equipment while still being able to produce as large a variety of products as possible. They can be represented by the following two equations, respectively:

η_M = N_tm / N_M    (1)

η_I = C_m / N_v    (2)
where N_tm and N_M are the number of targetable market niches and the total number of market niches, respectively, and C_m and N_v are the manufacturing equipment costs and the number of product varieties, respectively. Of course, a tradeoff also exists between the market efficiency and the investment efficiency, as an increase in the investment efficiency through a decrease in product variety can cause a decrease in the market efficiency.
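The following short Java sketch works through Equations (1) and (2) and the Meyer et al. (1997) efficiency and effectiveness ratios; only the formulas follow the text, and all numeric values are invented for illustration.

// Worked illustration of the family-level metrics above with invented numbers.
public final class FamilyMetricsDemo {

    public static void main(String[] args) {
        // Market efficiency (Eq. 1): targetable niches over total niches.
        double nTargetable = 8, nTotalNiches = 10;
        double etaM = nTargetable / nTotalNiches;                 // 0.80

        // Investment efficiency (Eq. 2): equipment cost per product variety.
        double equipmentCost = 500_000, nVarieties = 25;
        double etaI = equipmentCost / nVarieties;                 // 20,000 per variety

        // Platform efficiency and effectiveness (Meyer et al. 1997).
        double derivativeCost = 1.0e6, platformCost = 5.0e6, derivativeRevenue = 4.0e6;
        double platformEfficiency = derivativeCost / platformCost;         // 0.2
        double platformEffectiveness = derivativeRevenue / derivativeCost; // 4.0

        System.out.printf("eta_M = %.2f, eta_I = %.0f%n", etaM, etaI);
        System.out.printf("Platform efficiency = %.2f, effectiveness = %.1f%n",
                platformEfficiency, platformEffectiveness);
    }
}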
5. MODULE-BASED PRODUCT FAMILY DESIGN PROCESS

As shown in Figure 3, product variety can be implemented at different levels within the product architecture. From the perspective of product design, component standardization through a modular architecture has clear advantages in the areas of cost, product performance, and product development. Decomposing the problem into modules and defining how the modules are related to one another creates the model of a design problem. The modularization process, as shown in Figure 7, is achieved through the following steps (Zha and Lu 2002a,b):
(1) The requirement analysis and modeling for a product (family) is carried out from both the customer and the designer viewpoints using design function deployment (DFD) and the Hatley/Pirbhai technique (Sivaloganathan et al. 2001; Rushton and Zakarian 2000). A function-function interaction matrix is generated.
(2) A combination of heuristic and quantitative clustering algorithms is used to modularize the product (family) architecture, and a modularity matrix is constructed.
(3) All modules in the product (family) are identified through the modularity matrix, and the types (functions) of all these modules can be further identified according to the module classifications.
(4) The functional modules are mapped to structural modules using the function-structure interaction matrix.
(5) Hierarchical building blocks or design prototypes (Gero 1990) are used to represent the product (family) architecture from both the functional and the structural perspectives (Zha and Du 2001).
(6) A genetic algorithm is used to configure and optimize the product family architecture to achieve one or multiple main objectives (see Section 4.3). Other design objectives are transformed into constraints on modules or their attributes. In addition, cost and profit models are also built as system constraints.
(7) The product family architecture is rebuilt into a hierarchical architecture using the optimized modules, from both the functional and structural perspectives.
(8) The product family module space forms a product platform. The product family portfolio is derived from the product family module space.
(9) Standard interfaces are developed to facilitate the addition, removal, and substitution of modules.
(10) The product family can be generated by module configuration/reconfiguration.
(11) Product variants are evaluated and selected to satisfy the customer requirements.

Figure 7. Module-based product family design for mass customization.

Therefore, the steps for creating a module-based product family can be outlined as follows: 1) decompose products into their representative functions; 2) develop modules with one-to-one (or many-to-one) correspondence with functions; 3) group common functional modules into a common product platform; and 4) standardize interfaces to facilitate the addition, removal, and substitution of modules. The module-based product family design process develops a re-configurable product platform that can be easily modified and upgraded through the addition, substitution, and exclusion of modules to realize a module-based product family. Figure 8 describes the mathematical model for the modularization process in modular product family design for mass customization. Figure 9 gives an example of modular platform-based motor truck family design and development (modules → truck platform → truck variants). The fundamental issues underlying product family design include product information modeling, product family architecture, product platform and variety, modularity and commonality, product family generation, and product assessment and customization. Following the philosophy of the above stages, a modularized approach is proposed for product family design, in which a re-configurable product platform that can be easily modified and upgraded through the addition, substitution, and exclusion of modules is developed. An effective product family platform can allow a variety of derivative products to be created more rapidly and easily, with each product providing the features and functions desired by a particular market segment (Simpson et al. 1998, 2001). Unlike the traditional modular design approach, the modular family design process is roughly divided into two main stages: 1) product (family) planning, and 2) family design. It ranges from capturing the voice of customers and market trends for generating product design specifications, through formulating a product platform, to customizing products for customers' satisfaction. The product planning stage embeds the voice of customers into the design objective and generates product design specifications. The product family design realizes sufficient product variety (a family of products) to satisfy a range of customer demands. In the next section, we discuss a knowledge supported modular product family design process.
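As a toy illustration of step (2), the Java sketch below groups functions into modules from a function-function interaction matrix using a greedy threshold heuristic. The matrix, threshold, and function names are invented; the actual process combines heuristic and quantitative clustering as described above.

import java.util.ArrayList;
import java.util.List;

// Greedy grouping of functions into modules by interaction strength.
public final class InteractionClusteringDemo {

    public static void main(String[] args) {
        String[] functions = {"F1", "F2", "F3", "F4"};
        double[][] interaction = {   // symmetric interaction strengths in [0, 1]
            {1.0, 0.9, 0.1, 0.0},
            {0.9, 1.0, 0.2, 0.1},
            {0.1, 0.2, 1.0, 0.8},
            {0.0, 0.1, 0.8, 1.0}
        };
        double threshold = 0.5;      // join functions whose interaction exceeds this

        boolean[] assigned = new boolean[functions.length];
        List<List<String>> modules = new ArrayList<>();
        for (int i = 0; i < functions.length; i++) {
            if (assigned[i]) continue;
            List<String> module = new ArrayList<>(List.of(functions[i]));
            assigned[i] = true;
            for (int j = i + 1; j < functions.length; j++) {
                if (!assigned[j] && interaction[i][j] > threshold) {
                    module.add(functions[j]);   // strongly interacting: same module
                    assigned[j] = true;
                }
            }
            modules.add(module);
        }
        System.out.println("Modules: " + modules);  // [[F1, F2], [F3, F4]]
    }
}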
Figure 8. Modularization process in modular product family design. (The figure maps customer/market trends to functional objectives, operational functional requirements, and general functional requirements (GFR) with weights for the GFR.)
Figure 9. Modular truck family design and development (Volvo).
6. KNOWLEDGE SUPPORT FRAMEWORK FOR MODULAR PRODUCT FAMILY DESIGN
The design process is knowledge intensive, as there is a large amount of knowledge that designers call upon and use during the design process to match the ever-increasing complexity of design problems. Given that even the most routine design tasks depend upon vast amounts of expert design knowledge, there is a need for some form of knowledge support. Design knowledge refers to the collection of knowledge needed to support the design activities and decision-making in the design process. Successfully capturing design knowledge, effectively representing it, and easily accessing it are crucial to increasing the "science" content, relative to the "art" nature, of the product family design process. The main characteristics of product family design are modularity, commonality/reusability, and standardization. Designing product families requires knowledge defining these characteristics. Details are discussed below.

6.1. Knowledge support scheme, challenges and key issues
Once the concepts of a product platform and a product family architecture are established to describe product families, a representation or modeling scheme is needed to model them. Existing representation/modeling schemes for product families vary in the literature, and include two types of representational models: product family architecture and product family evolution. These models are related to the formulation of the product platform for product family generation and play crucial roles in downstream stages such as product family evaluation.
Figure 10. Knowledge support framework for module-based product family design.
The fundamental issues underlying the product family design process include product information modeling, product family architecture, product platform and variety, modularity and commonality, product family generation, and product assessment. With respect to the modular family design approach discussed above, a knowledge intensive support framework is developed, as illustrated in Figure 10. Design knowledge is classified into two categories: product information and knowledge, and process knowledge. These two categories of knowledge are utilized to support the two main stages, product planning and family design, in the whole process of modular product family design. How knowledge is modeled and how it supports the modular product family design process are discussed below. The knowledge supported product family planning stage assists the designer in capturing the voice of customers and market trends and embedding them into the design objective for generating product design specifications (PDS) and customizing products for customers' satisfaction. The knowledge support for product family design assists designers in realizing sufficient product variety (a family of products) to satisfy a range of customer demands. With an understanding of the fundamental issues in product family design, a more detailed scheme with knowledge support, shown in Figure 11, is adopted for customer requirements modeling, product architecture modeling, product platform establishment, product family generation, and product assessment.
Figure 11. Knowledge support scheme for modular product family design process.
The modular product family design process is roughly divided into two main stages, product platform generation and product assessment, and is implemented through product planning for design specification generation (e.g. function requirements and design constraints), modular design, configuration design, and product assessment. The key research issues for the knowledge support scheme for modular product family design can therefore be summarized as follows:
(1) Design information and knowledge modeling: design knowledge capture, classification, representation, and organization and management;
(2) Product architecture modeling: representing product variety, component modularization and standardization, product management, etc.;
(3) Product platform establishment: exploring methods for feature-based module design and configuration design;
(4) Product family generation: generating product variants or family members; and
(5) Product assessment: evaluating product variants.
Each of the above issues has many detailed sub-issues to be addressed. The challenging, but critical, ones are the product/family architecture representation and product platform establishment, which are related to product architecture modeling, product platform generation, and the process from product architecture modeling to product platform generation, as illustrated in Figure 12.
Figure 12. Product family design: key research issues.
The product family architecture should represent the conceptual structure and logical organization of product families from the viewpoints of both customers and designers (engineering related). A well-developed product family architecture can provide a generic architecture to capture and utilize commonality, within which each new product expands so as to anchor future designs to a common product line structure.

6.2. Product family design knowledge modeling and support
Based on the above described knowledge support scheme, the implementation of knowledge supported module-based product family design can be achieved through two steps: 1) knowledge modeling, and 2) the knowledge support process, which are discussed in this section.

6.2.1. Product family design knowledge modeling issues
The complexity and diversity of engineering knowledge result in high demands on knowledge modeling in engineering: the many different aspects and their relationships have to be described in a complete, consistent, coherent, and concise way. Even if we assume that the corresponding advanced knowledge processing capabilities exist, adequate modeling of engineering knowledge remains a challenge. Object-oriented (O-O) techniques and STEP provide some expressiveness and formal rigor as platforms for knowledge modeling in product family design.
Figure 13. Knowledge modeling in product family design. (The figure relates design knowledge sources, design knowledge capture, design knowledge representation (abstraction and formalization), and design knowledge organization and management (classification and representation) around the product family designer.)
CommonKADS (http://www.commonKads.uva.nl/), as a dedicated knowledge oriented approach, can be seen as a powerful framework for knowledge modeling in general, but in its current concrete form it is not expressive and differentiated enough to fulfill the high knowledge modeling demands of engineering. Product design knowledge is a collection of data/information and knowledge needed to support the design activities and decision-making in the product/family design process. It includes all information defined and created during the design process and all knowledge used to create that information. The former is often defined as product knowledge, which includes all product or artifact related information needed throughout the whole design process, such as product specifications, concepts, structure, and geometry. The latter is referred to as process knowledge, which can be described in two aspects: design activities/tasks and design rationale. Design knowledge modeling is to capture, represent, organize, and manage design knowledge in the design process. Further, the knowledge modeling process for product/family design is to elicit design knowledge in product family design and establish a comprehensive knowledge repository that can be retrieved and reused when necessary. The key issues related to product family design knowledge modeling are shown in Figure 13; they include design knowledge capture, classification, representation, organization, and management.
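The product-knowledge/process-knowledge split above can be made tangible with a tiny repository entry type. The following Java sketch is an illustrative assumption; the field names and categories are ours, not a schema from this chapter.

// Hypothetical repository entry classifying captured design knowledge.
public final class DesignKnowledgeDemo {

    enum Category { PRODUCT_KNOWLEDGE, PROCESS_KNOWLEDGE }
    enum Availability { OFF_LINE, ON_LINE }

    record KnowledgeItem(String id, Category category, Availability availability,
                         String content) { }

    public static void main(String[] args) {
        KnowledgeItem spec = new KnowledgeItem("K-001", Category.PRODUCT_KNOWLEDGE,
                Availability.ON_LINE, "Power supply output: 5 V / 1 A");
        KnowledgeItem rationale = new KnowledgeItem("K-002", Category.PROCESS_KNOWLEDGE,
                Availability.OFF_LINE, "Chose modular casing to ease variant derivation");

        System.out.println(spec);       // product data/information
        System.out.println(rationale);  // design rationale (process knowledge)
    }
}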
Figure 14. Platform information model and product platform lifecycle (modified from Sivard 2000).
The approach is to model a product family architecture, according to the semantics used in product development, prepared for the information needs of configuration, as shown in Figure 14. The product structure and components of the generic information platform (GIP) (Sivard 2000) are represented in the physical domain of axiomatic design, and configuration rules and mappings are represented as constraints and mappings between the functional, physical, and process domains. Ideally, this model is adapted to the STEP product-modeling standard, thereby creating a standardized information platform covering the reasoning of development as well as order processing. STEP is relevant since it contains modeling constructs for representing alternatives, configuration rules, and many other aspects of product platforms. Further, it is considered one of the most general product modeling standards and is being adopted by many PDM suppliers. Still, it lacks principles for how to represent many product platform concepts. Apart from studies of product and product family design, a basis of this research is knowledge based configuration systems and their information modeling and application. The purpose of adapting the conceptual model to a standard is twofold: 1) the standard provides functionality and detailed information models, and 2) a
standard format supports the exchange of information between applications and users. With the help of a product platform, customers' requirements are satisfied either by standard models or by customer models configured from standard or custom modules and/or components.
6.2.2. Knowledge modeling/representation for product family design

Product family design starts from a set of customer/functional requirements for the product. The requirements are implemented by a set of modules described in terms of design variables of the product principle. These design variables of a module propagate to the functional requirements on the lower level elements of the module, and so on, until all the modules and elements are specified. With respect to the product family design process, three groups of knowledge are required: 1) how to deploy the functions of products (modules) to lower level modules; 2) how to select solutions among the standard ones or the custom ones; and 3) how, after selection, all of the solutions are to be configured into an end product. The performance of each solution has to be estimated to help the decision making of both the designer and the customer. As discussed above, product family design knowledge can be classified into two categories: product information and knowledge, and process knowledge. These two categories of knowledge are utilized to support the two main stages, product planning and family design, in the whole process of modular product family design. The product family design knowledge should be abstracted and classified into different categories, e.g., off-line and on-line, product data/information and design process knowledge, through analysis of the product family design process. Different categories of product family design knowledge are represented in different ways from multiple views of the product/family design process. Since product design knowledge includes all product data/information needed throughout the whole family design process, a new product data/information model must be employed, which may include customer/task requirements, design specifications, functions-behaviors, structures, assemblies, performance constraints/metrics, etc.:

Product Definition: Customer/Task Requirements; Specifications; Functions-Behaviors-Structures; Performance Objectives and Constraints; Assembly Structure; Module Details.
Product Variety: Family Parameters.
As shown in Figure 11, the product repository may be composed of functions, means, structures, a features library, a modules library, types, attributes, relationships, rules, constraints, evaluation/selection criteria, etc.
Figure 15. The architecture of product platform and its construction process.
In practice, an effective way to create a product data/information representation model is to integrate the database representation model and the design process model. Such a data/information model still needs to be divided into two parts: one for modules and the other for module assemblies. The module representations may follow an object-based formalism, while the module assemblies may be based on graph theory and its incidence matrix representation. Following the requirements of designing product families with a high degree of commonality, as well as designing several products around reusable components, the two main elements of the architecture are: 1) generic product specifications and 2) reusable solution libraries. Product architectures and component architectures are treated in a similar way, enabling a hierarchical structure of structures. Thus, classes or families of components may be selected from the solution library and integrated into the framework, as shown in Figure 15. Therefore, a multi-level hybrid representation schema (meta level, physical level, geometric level) is adopted to represent the product design process knowledge in different design stages at different levels, based on a combination of semantic relationships with the object-oriented data model. For illustration, an object-oriented representation instance for a robot family and its parameterized module information (e.g. link and joint modules) is described as follows:
Object (Joint module) {
  Motor type: [maxonRE25.118799, maxon2260.8755, ...];
  Gearhead type: [maxon16.118188, maxon26.110396, ...];
  Material type: [steel, copper, aluminum, ...];
  Number of DOFs: [1, 2, 3];
  Motion type: [translation, rotation];
  Active attribute: [passive, active];
  Generalized force ranges: [force, torque];
  Connected module types: [link, joint, other];
  Motion ranges: [displ. (S), vel. (V), accel. (A)];
  Adjustable parameters: [initial poses];
  Assembly pattern: [no., input/output ports];
  Dimension parameters: [len. (L), wid. (W), heigh. (H)];
  Dynamic parameters: [mass, center of mass, inertial];
}

Object (Link module) {
  Connected module types: [link, joint];
  Assembly pattern: [no., input/output ports];
  Fixed dimensions: [displacement and orientation];
  Changeable parameters: [displacement or orientation];
  Dynamic parameters: [mass, center of mass, inertial];
}
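As a sketch of how such module objects might be used during configuration, the following Java fragment instantiates a joint and a link and checks connection compatibility against the declared connectable module types. The record type and the compatibility rule are illustrative assumptions, not the chapter's implementation.

import java.util.Set;

// Hypothetical compatibility check between parameterized robot modules.
public final class RobotModuleDemo {

    record Module(String name, String type, Set<String> connectableTypes) {
        boolean canConnect(Module other) {
            return connectableTypes.contains(other.type);  // per "Connected module types"
        }
    }

    public static void main(String[] args) {
        Module joint = new Module("J1", "joint", Set.of("link", "joint", "other"));
        Module link  = new Module("L1", "link",  Set.of("link", "joint"));

        System.out.println("J1 -> L1 allowed: " + joint.canConnect(link));  // true
    }
}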
6.2.3. Knowledge support process for modular product family design
Once the design knowledge repository is built up, the user or designer can utilize the knowledge in it to solve problems in product family design. As discussed in Sections 3, 4, and 5 above, the whole design process is roughly divided into two main stages: product platform formulation for family generation, and product evaluation or assessment for mass customization. Thus, the knowledge support process covers these two stages. Incorporating the modularization process described above, the knowledge supported modular product family design process can be fulfilled. The knowledge support process in product design evaluation for mass customization proceeds through the elimination of unacceptable alternatives, the evaluation of candidates, and the final decision-making under the customers' requirements and design constraints (Zha and Sriram et al. 2003). With respect to the traditional approach to product evaluation (Pahl and Beitz 1996), the knowledge resources utilized in the process include differentiating features, customers' requirements, preferences and importance (weights), trade-offs (e.g. market vs. investment), assemblability and manufacturability, utility functions, and heuristic knowledge (e.g. production rules); a minimal scoring sketch follows the notes below. In applying the above knowledge support scheme for modular product family design, the following points should be noted:
(1) System requirement modeling and analysis should be the first step in the development of a modular product family.
(2) Development of a modular product family is a complex task; a systematic and structured approach is mandatory.
(3) Functional analysis is best suited for developing new product families, rather than modifying existing ones.
(4) Large complex products or systems have a considerable number of constraints that limit the design of modular product families.
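The Java sketch below illustrates the evaluation step in its simplest form: variants violating a hard cost constraint are eliminated, and the survivors are ranked by a weighted score of customer criteria. The weights, criteria, and cost cap are invented; the actual Design Advisor uses fuzzy clustering and ranking (Zha and Sriram et al. 2003).

import java.util.LinkedHashMap;
import java.util.Map;

// Toy eliminate-then-rank evaluation of product variants.
public final class VariantEvaluationDemo {

    public static void main(String[] args) {
        double wPerformance = 0.6, wAssemblability = 0.4;    // customer importance weights
        double costCap = 120.0;                              // hard design constraint

        // variant -> {cost, performance score, assemblability score}, all invented
        Map<String, double[]> variants = new LinkedHashMap<>();
        variants.put("V1", new double[]{100.0, 0.90, 0.60});
        variants.put("V2", new double[]{130.0, 0.95, 0.90}); // eliminated: exceeds cost cap
        variants.put("V3", new double[]{110.0, 0.70, 0.95});

        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, double[]> e : variants.entrySet()) {
            double[] v = e.getValue();
            if (v[0] > costCap) continue;                    // elimination phase
            double score = wPerformance * v[1] + wAssemblability * v[2];
            if (score > bestScore) { bestScore = score; best = e.getKey(); }
        }
        System.out.println("Selected variant: " + best);     // V1
    }
}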
7. KNOWLEDGE INTENSIVE SUPPORT SYSTEM FOR PRODUCT FAMILY DESIGN

A knowledge support system is developed to assist the designer in the product family design process to generate, select, and evaluate product families automatically. Figure 16 shows a web client/server implementation architecture for the knowledge support system to support modular product family design. As shown in Figure 16, the web based design framework uses the design-with-modules, modules-network, and knowledge-support paradigms, techniques by which knowledge-based systems utilize the connectivity provided by the Internet to increase the size of the user base whilst minimizing distribution and maintenance overheads. The knowledge intensive support system can thus exploit the modularity of knowledge-based systems, in that the inference engine and knowledge bases are located on server computers and the user interface is exported on demand to client computers via network connections (e.g. the Internet, WWW). Modules under the knowledge support framework are therefore connected together so that they can exchange services to form large collaborative integrated models. The module structure lends itself to a client (browser)/knowledge server oriented architecture using distributed object technology. The implementation of the knowledge intensive support system uses a two-tiered client/knowledge server architecture to support collaborative design interactions with a web-browser based graphical user interface (GUI). The underlying framework and the knowledge engine are written in Java, integrated with the Java Expert System Shell, Jess/FuzzyJess (Ernest 1999; NRCC 2003). It also integrates with existing application packages such as CAD and database applications. CORBA serves as an information and service exchange infrastructure above the computer network layer and provides the capability to interact with existing CAD applications and database management systems through other Object Request Brokers (ORBs). In turn, the framework provides the methods and interfaces needed for interaction with other modules in the networked environment. Based on the architecture of the knowledge support system, its functionality is achieved through implementing the following subsystems: the web GUI, the knowledge repository, and the advisory system for modular product family design. The knowledge repository is able to capture, store, and retrieve design knowledge, including customer requirements, design objectives, design modules, design rationale, evaluation criteria, and product varieties, etc. (Szykman et al. 2000, 2001). The modular design advisory system (Design Advisor) includes a decision-making mechanism and a product module reasoning engine. The knowledge supported product module reasoning engine is developed to reason about sets of product architectures, to translate design requirements into constraints on these sets, to compare architecture modules from different viewpoints, and to enumerate all feasible modules using "generate-and-test" or heuristic approaches.
Figure 16. Internet and web-enabled knowledge support system architecture. (Client side: web GUI clients; server side: modular design server, decision maker, product database, rules and constraints, and knowledge base.)
The web GUI provides users with the ability to:
(1) examine the customers' requirements and the configuration of design problem models,
(2) generate a product platform,
(3) analyze tradeoffs and varieties by modifying design parameters within modules,
(4) search for product alternatives in a product family, and
(5) select the final solutions with the knowledge-based support systems and/or an optimization tool (e.g. GA and SA).
The web GUI is a pure client of a knowledge server, delegating all events to the associated server. For wide accessibility and interoperability, the GUI is implemented as a web browser based client application. The front-end side of the application is implemented as a combination of XML (eXtensible Markup Language) documents, VRML (Virtual Reality Modeling Language) worlds, and Java applets. The back-end side system components include a knowledge repository, a modular design server, a product family generation server, a product evaluation server, a models and modules base server, a CAD and graphics server, a database server, and knowledge assistant and inter-server communication and explanation facilities (Siegel 1996; IONA 1997). A commercial ORB implementation (OrbixWeb) is employed for the CORBA-based remote communication between the GUI Java applets and the back-end side system components. The Design Advisor system, consisting of a cluster analysis module, a ranking module, a selection module, a neural-fuzzy module, and visualization and explanation facilities, was developed in (Zha and Sriram et al. 2003). The current capabilities of the prototype include capturing and browsing the evolution of product families and of product variant configurations in product families, ranking and evaluation, and selection of product variants in a product family. The comprehensive fuzzy decision support system can visualize and explain the reasoning process, which distinguishes the knowledge support system from a traditional program. With this subsystem, the designer can represent the available design choices as a fuzzy AND/OR tree. The fuzzy clustering and ranking algorithms employed in it are able to evaluate and select the (near) overall optimal design that best satisfies customer requirements. The selected design choice is highlighted in the represented tree. Figure 17 demonstrates a modularity and XML representation of a power supply for a Zip disk drive, and Figure 18 gives a screen snapshot of the prototype system used for power supply family design.

Figure 17. Modularity and XML representation of power supply for Zip disk drive.

Figure 18. Screen snapshot of power supply family design.

When fully developed, the knowledge intensive support system for product family design can provide the following benefits:
(1) capture and manage design information and knowledge (e.g. know-how), and retrieve previous knowledge;
(2) provide real-time information and knowledge services to help or assist designers in family-based product design;
(3) support communication and collaborative teamwork by sharing the most up-to-date design information and knowledge;
(4) reduce product development cycle time and lower total cost;
The Design Advisor system, consisting of a cluster analysis module, a ranking module, a selection module, a neural-fuzzy module, and visualization and explanation facilities, was developed in (Zha et al. 2003). The current capabilities of the prototype include capturing and browsing the evolution of product families and of product variant configurations within them, as well as ranking, evaluation, and selection of product variants in a product family. The comprehensive fuzzy decision support system can visualize and explain the reasoning process, which distinguishes the knowledge support system from a traditional program. With this subsystem, the designer can represent the available design choices as a fuzzy AND/OR tree. The fuzzy clustering and ranking algorithms employed are able to evaluate and select the (near) overall optimal design that best satisfies customer requirements, and the selected design choice is highlighted in the tree; a minimal sketch of evaluating such a tree is given below.
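The following is a minimal sketch only, assuming the common min/max fuzzy operators for AND/OR nodes; the Design Advisor's actual fuzzy clustering and ranking algorithms are more elaborate, and the numeric degrees are invented for illustration.

import java.util.Arrays;
import java.util.List;

/**
 * Illustrative fuzzy AND/OR design-choice tree. AND nodes take the
 * minimum and OR nodes the maximum of their children's satisfaction
 * degrees; leaves carry the degree to which a design choice satisfies
 * the customer requirements.
 */
class FuzzyNode {
    enum Type { AND, OR, LEAF }

    final Type type;
    final double degree;            // satisfaction degree, LEAF nodes only
    final List<FuzzyNode> children;

    FuzzyNode(Type type, double degree, List<FuzzyNode> children) {
        this.type = type;
        this.degree = degree;
        this.children = children;
    }

    static FuzzyNode leaf(double degree) {
        return new FuzzyNode(Type.LEAF, degree, List.of());
    }

    /** Overall satisfaction degree of the subtree rooted at this node. */
    double evaluate() {
        switch (type) {
            case LEAF: return degree;
            case AND:  return children.stream().mapToDouble(FuzzyNode::evaluate).min().orElse(0.0);
            default:   return children.stream().mapToDouble(FuzzyNode::evaluate).max().orElse(0.0);
        }
    }

    public static void main(String[] args) {
        // OR over two candidate modules, each an AND of two criteria:
        FuzzyNode tree = new FuzzyNode(Type.OR, 0.0, Arrays.asList(
                new FuzzyNode(Type.AND, 0.0, Arrays.asList(leaf(0.8), leaf(0.6))),
                new FuzzyNode(Type.AND, 0.0, Arrays.asList(leaf(0.9), leaf(0.4)))));
        System.out.println(tree.evaluate()); // prints 0.6: the first branch wins
    }
}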
Figure 17 shows the modularity and XML representation of a power supply for a Zip disk drive (a root group decomposed into primitive modules), and Figure 18 gives a screen snapshot of the prototype system applied to power supply family design.

Figure 17. Modularity and XML representation of power supply for Zip disk drive.

Figure 18. Screen snapshot of power supply family design.

When fully developed, the knowledge intensive support system for product family design can deliver the following benefits: (1) capture and manage design information and knowledge (e.g., know-how), and retrieve previous knowledge; (2) provide real-time information and knowledge services to assist designers in family-based product design; (3) support communication and collaborative teamwork by sharing the most up-to-date design information and knowledge; (4) reduce product development cycle time and lower total cost; (5) improve customer satisfaction; and (6) improve the competitiveness and sales of a company.

8. SUMMARY AND FUTURE WORK
This chapter presented a framework for platform-based product development and knowledge support for product family design. An integrated modular product family design scheme was proposed with knowledge support for customer requirements modeling, product architecture modeling, product platform establishment, product family generation, and product assessment. The developed methodology and framework can be used for capturing, representing, organizing, and managing product family design knowledge, and offer support in the design process. Finally, the issues related to the implementation of the knowledge support framework for product family design were addressed, and the system implementation architecture and functionality were described to support platform-based product family design and development. When fully developed, the system can support product family design effectively and efficiently and improve customer satisfaction. Future work is required to further develop a web-based knowledge repository and design support system for module-based product family design. Also, the model presented in this chapter will be incorporated into the core product model (Fenves 2001) and the product family evolution model (Wang et al. 2003) recently developed at the National Institute of Standards and Technology, USA.

Disclaimer
Commercial equipment and software, many of which are either registered or trademarked, are identified in order to adequately specify certain procedures. In no case does such identification imply recommendation or endorsement by the National Institute of Standards and Technology, nor does it imply that the materials or equipment identified are necessarily the best available for the purpose. Part of the work was done while the first author was at the Singapore Institute of Manufacturing Technology, Singapore.

REFERENCES

Agarwal, M. and Cagan, J., 1998, "A Blend of Different Tastes: The Language of Coffeemakers," Environment and Planning B: Planning and Design, 25(2): 205-226.
Brown, D. C., 1998, "Defining Configuring," http://www.cs.wpi.edu/~dcb/Config/EdamConfig.html, AI EDAM special issue on Configuration.
Chang, T.-S. and Ward, A. C., 1995, "Design-In-Modularity with Conceptual Robustness," Design Technical Conference ASME 1995, DE-Vol. 82.
Chen, I. M., Yeo, S. H., Chen, G., and Yang, G. L., 1999, "Kernel for Modular Robot Applications: Automatic Modeling Techniques," Int. J. Robotics Research, pp. 225-242.
Chen, W., Allen, J. K., Mavris, D., and Mistree, F., 1996, "A Concept Exploration Method for Determining Robust Top-Level Specifications," Engineering Optimization, Vol. 26: pp. 137-158.
Chen, W., Rosen, D., Allen, J., and Mistree, F., 1994, "Modularity and the Independence of Functional Requirements in Designing Complex Systems," Concurrent Product Design, Vol. 74: pp. 31-38.
Collier, D. A., 1981, "The Measurement and Operating Benefits of Component Part Commonality," Decision Sciences, Vol. 12(1): pp. 85-96.
Collier, D. A., 1982, "Aggregate Safety Stock Levels and Component Part Commonality," Management Science, Vol. 28(11): pp. 1296-1303.
Cho, J. R., 2000, Product Structuring for Customer, Assembly and Maintenance, Assembly Automation Lab., Industrial Engineering, Pusan National University, Korea.
Dasgupta, D. and McGregor, D. R., 1994, "A More Biologically Motivated Genetic Algorithm: The Model and Some Results," Cybernetics and Systems: An International Journal, Vol. 25: pp. 447-469.
Du, X. H., Jiao, J. X., and Tseng, M. M., 2001, Product Platform Planning for Mass Customization, Department of Industrial Engineering & Engineering Management, The Hong Kong University of Science and Technology, Hong Kong.
Erens, F. and Verhulst, K., 1997, "Architectures for Product Families," Computers in Industry, Vol. 33(2-3): pp. 165-178.
Friedman-Hill, E. J., 1999, The Java Expert System Shell (Jess), http://herzberg.ca.sandia.gov/jess, Sandia National Laboratories, USA.
Fenves, S. J., 2001, "A Core Product Model for Representing Design Information," NISTIR 6736, NIST, Gaithersburg, MD.
Finch, W. W., 1997, Predicate Logic Representations for Design Constraints on Uncertainty Supporting the Set-Based Design Paradigm, Ph.D. Thesis, The University of Michigan, Ann Arbor.
Fujita, K., 2000, "Product Variety Optimization under Modular Architecture," Proceedings of the Third International Symposium on Tools and Methods of Competitive Engineering (TMCE2000), pp. 451-464.
Fujita, K., Sakaguchi, H., and Akagi, S., 1999, "Product Variety Deployment and its Optimization under Modular Architecture and Module Commonalization," Proceedings of the 1999 ASME Design Engineering Technical Conferences, Paper No. DETC99/DFM-8923, ASME.
Fujita, K., Akagi, S., Yoneda, T., and Ishikawa, M., 1998, "Simultaneous Optimization of Product Family Sharing System Structure and Configuration," Proceedings of the 1998 ASME Design Engineering Technical Conferences, Paper No. DETC98/DFM-5722, ASME.
Fujita, K. and Ishii, K., 1997, "Task Structuring Toward Computational Approaches to Product Variety Design," Proceedings of the 1997 ASME Design Engineering Technical Conferences, Paper No. 97-DETC/DAC-3766, ASME.
Gaither, N., 1980, Production and Operations Management: A Problem-Solving and Decision-Making Approach, The Dryden Press, New York.
Gero, J. S., 1990, "Design Prototypes: A Knowledge Representation Schema for Design," AI Magazine, 11(4): 26-36.
Goldberg, D. E., 1989, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley Publishing Company, Inc., New York.
Gonzalez-Zugasti, J. P., 2000, Models for Platform-Based Product Family Design, Ph.D. Thesis, MIT, Cambridge.
Gorti, S. R., Gupta, A., Kim, G. J., Sriram, R. D., and Wong, A., 1998, "An Object-Oriented Representation for Product and Design Process," Computer-Aided Design, Vol. 30, No. 7, pp. 489-501.
Gu, P. and Sosale, S., 1999, "Product Modularization for Life Cycle Engineering," Robotics and Computer-Integrated Manufacturing, Vol. 15(5): pp. 387-401.
Ishii, K., Juengel, C., and Eubanks, C. F., 1995, "Design for Product Variety: Key to Product Line Structuring," ASME Design Theory and Methodology Conference, Boston, MA, DE-Vol. 83: pp. 499-506.
IONA, 1997, Orbix2 Programming Guide, IONA Technologies Ltd.
Dahmus, J. B., Gonzalez-Zugasti, J. P., and Otto, K. N., 2001, "Modular Product Architecture," Design Studies, Vol. 22(5): pp. 409-424.
Jiao, J. X., Tseng, M. M., Ma, Q., and Zou, Y., 2000, "Generic Bill of Materials and Operations for High-Variety Production Management," Concurrent Engineering: Research and Application, Vol. 8, No. 4, pp. 297-322.
Krishnan, V. and Gupta, S., 2001, "Appropriateness and Impact of Platform-based Product Development," Management Science, 47(1): pp. 52-68.
Kotler, P., 1989, "From Mass Marketing to Mass Customization," Planning Review, Vol. 17(5): pp. 10-15.
Kusiak, A. and Huang, C. C., 1996, "Development of Modular Products," IEEE Trans. on Components, Packaging, and Manufacturing Technology, Part A, Vol. 19(4): pp. 523-538.
Lee, H. L. and Tang, C. S., 1997, "Modeling the Costs and Benefits of Delayed Product Differentiation," Management Science, Vol. 43(1): pp. 40-53.
Leger, C., 1999, Automated Synthesis and Optimization of Robot Configurations: An Evolutionary Approach, Ph.D. Thesis, The Robotics Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania.
Martin, M. and Ishii, K., 1996, "Design for Variety: A Methodology for Understanding the Costs of Product Proliferation," 1996 Design Theory and Methodology Conference (Wood, K., ed.), Irvine, CA, ASME, Paper No. 96-DETC/DTM-1610.
McDermott, C. M. and Stock, G. N., 1994, "The Use of Common Parts and Designs in High-Tech Industries: A Strategic Approach," Production and Inventory Management Journal, Vol. 35(3): pp. 65-68.
McKay, A., Erens, F., and Bloor, M. S., 1996, "Relating Product Definition and Product Variety," Research in Engineering Design, Vol. 8(2): pp. 63-80.
Meyer, M. H., 1997, "Revitalize Your Product Lines through Continuous Platform Renewal," Research Technology Management, Vol. 40(2): pp. 17-28.
Meyer, M. H. and Utterback, J. M., 1993, "The Product Family and the Dynamics of Core Capability," Sloan Management Review, Vol. 34 (Spring): pp. 29-47.
Meyer, M. H., Tertzakian, P., and Utterback, J. M., 1997, "Metrics for Managing Research and Development in the Context of the Product Family," Management Science, Vol. 43(1): pp. 88-111.
Meyer, M. H. and Lehnerd, A. P., 1997, The Power of Product Platforms, New York: The Free Press.
NRCC (National Research Council of Canada), 2003, Fuzzy Logic in Integrated Reasoning, http://www.iit.nrc.ca/IR_public/fuzzy/.
Nutt, G. J., 1992, Open Systems, Prentice Hall, Englewood Cliffs, NJ.
Pahl, G. and Beitz, W., 1996, Engineering Design: A Systematic Approach, New York: Springer.
Pfaltz, J. L. and Rosenfeld, A., 1969, "Web Grammars," Proceedings of the First International Joint Conference on Artificial Intelligence, Washington, D.C., pp. 609-619.
Pine, B. J., 1993, Mass Customization: The New Frontier in Business Competition, Boston, MA, Harvard Business School Press.
Paredis, C. J. J., 1996, An Agent-Based Approach to the Design of Rapidly Deployable Fault Tolerant Manipulators, Ph.D. Thesis, Carnegie Mellon University, Pittsburgh.
Rosen, D. W., 1996, "Design of Modular Product Architectures in Discrete Design Spaces Subject to Life Cycle Issues," 1996 ASME Design Automation Conference, Irvine, CA, 96-DETC/DAC-1485.
Reddy, G. and Cagan, J., 1995, "An Improved Shape Annealing Algorithm for Truss Topology Generation," ASME Journal of Mechanical Design, Vol. 117: pp. 315-321.
Rothwell, R. and Gardiner, P., 1990, "Robustness and Product Design Families," Design Management: A Handbook of Issues and Methods (Oakley, M., ed.), Basil Blackwell Inc., Cambridge, MA, pp. 279-292.
Rushton, G. Z., 2000, Development of Modular Vehicle Systems, Department of Industrial and Manufacturing Systems Engineering, University of Michigan, Dearborn.
Sanderson, S. and Uzumeri, M., 1995, "Managing Product Families: The Case of the Sony Walkman," Research Policy, Vol. 24: pp. 761-782.
Samuel, A. K. and Bellam, S., 2000, http://www.glue.umd.edu/~sbellam/.
Sanderson, S. W., 1991, "Cost Models for Evaluating Virtual Design Strategies in Multi-cycle Product Families," Journal of Engineering and Technology Management, Vol. 8: pp. 339-358.
Shirley, G. V., 1990, "Models for Managing the Redesign and Manufacture of Product Sets," Journal of Manufacturing and Operations Management, Vol. 3(2): pp. 85-104.
Siddique, Z. and Rosen, D. W., 2001, "On Discrete Design Spaces for the Configuration Design of Product Families," Artificial Intelligence in Engineering, Design, Automation, and Manufacturing, Vol. 15, pp. 1-18.
Siddique, Z. and Rosen, D. W., 1999, "Product Platform Design: A Graph Grammar Approach," Proceedings of DETC'99, 1999 ASME Design Engineering Technical Conferences, Sept. 12-16, 1999, Las Vegas, Nevada, DETC99/DTM-8762.
Siegel, J., 1996, CORBA: Fundamentals and Programming, OMG.
Simpson, T. W., 1998, A Concept Exploration Method for Product Family Design, Ph.D. Dissertation, System Realization Laboratory, Woodruff School of Mechanical Engineering, Georgia Institute of Technology.
Simpson, T. W., Maier, J. R. A., and Mistree, F., 2001, "Product Platform Design: Method and Application," Research in Engineering Design, Vol. 13, pp. 2-22.
Sivaloganathan, S., Andrews, P. T. J., and Shahin, T. M. M., 2001, "Design Function Deployment: A Tutorial Introduction," Journal of Engineering Design, Vol. 12, No. 1, pp. 59-74.
Sivard, G., 2000, A Generic Information Platform for Product Families, Doctoral Thesis, Royal Institute of Technology, Sweden.
Sriram, R. D., 1997, Intelligent Systems for Engineering: A Knowledge-based Approach, London: Springer-Verlag, UK.
Sriram, R. D., 2002, Distributed and Integrated Collaborative Engineering Design, Sarven Publishers, Glenwood, MD 21738, USA.
Stadzisz, P. C. and Henrioud, J. M., 1995, "Integrated Design of Product Families and Assembly Systems," IEEE International Conference on Robotics and Automation, Nagoya, Aichi, Japan, Vol. 2 of 3: pp. 1290-1295.
Stone, R. B., Wood, K. L., and Crawford, R. H., 2000, "A Heuristic Method for Identifying Modules for Product Architectures," Design Studies, Vol. 21(1): pp. 15-31.
Stokes, M., 2000, Managing Engineering Knowledge: MOKA Methodology for Knowledge Based Engineering Applications, MOKA Consortium, London.
Suh, N. P., 1990, The Principles of Design, New York: Oxford University Press.
Szykman, S., Sriram, R. D., and Regli, W. C., 2001, "The Role of Knowledge in Next-generation Product Development Systems," Journal of Computing and Information Science in Engineering, Transactions of the ASME, Vol. 1, pp. 3-11.
Szykman, S., Racz, J. W., Bochenek, C., and Sriram, R. D., 2000, "A Web-based System for Design Artifact Modeling," Design Studies, Vol. 21, No. 2, pp. 145-165.
Tichem, M. et al., 1997, "Designer Support for Product Structuring: Development of a DFX Tool within the Design Coordination Framework," Computers in Industry, Vol. 33(2-3): pp. 155-163.
Tong, C. and Sriram, D. (Eds.), 1991a, Artificial Intelligence in Engineering Design, Volume I, Representation: Structure, Function and Constraints; Routine Design, Academic Press.
Tong, C. and Sriram, D. (Eds.), 1991b, Artificial Intelligence in Engineering Design, Volume III, Knowledge Acquisition, Commercial Systems; Integrated Environments, Academic Press.
Tseng, M. M. and Jiao, J. X., 1996, "Design for Mass Customization," CIRP Annals, Vol. 45, No. 1, pp. 153-156.
Tseng, M. M. and Jiao, J. X., 1998, "Product Family Modeling for Mass Customization," Computers in Industry, Vol. 35(3-4): pp. 495-498.
Ulrich, K. and Tung, K., 1991, "Fundamentals of Product Modularity," Proceedings of ASME Winter Annual Meeting Conference, Atlanta, GA, DE-Vol. 39: pp. 73-80.
Ulrich, K., 1995, "The Role of Product Architecture in the Manufacturing Firm," Research Policy, Vol. 24(3): pp. 419-440.
Ulrich, K. T. and Eppinger, S. D., 1995, Product Design and Development, McGraw-Hill, Inc., New York.
Uzumeri, M. and Sanderson, S., 1995, "A Framework for Model and Product Family Competition," Research Policy, Vol. 24: pp. 583-607.
Vuuren, W. V. and Halman, J. I. M., 2001, "Platform-Driven Development of Product Families: Linking Theory with Practice," Proceedings of the Conference on "The Future of Innovation Studies," Eindhoven University of Technology, The Netherlands.
Wang, F., Fenves, S. J., Sudarsan, R., and Sriram, R. D., 2003, "Towards Modeling the Evolution of Product Families," Proceedings of 2003 ASME DETC, Paper No. CIE-48216.
Wheelwright, S. C. and Sasser, W. E., 1989, "The New Product Development Map," Harvard Business Review, Vol. 67 (May-June), pp. 112-125.
Wheelwright, S. C. and Clark, K. B., 1992, "Creating Project Plans to Focus Product Development," Harvard Business Review, Vol. 70 (March-April): pp. 70-82.
Yu, J. S., Gonzalez-Zugasti, J. P., and Otto, K. N., 1999, "Product Architecture Definition Based Upon Customer Demands," Journal of Mechanical Design, Transactions of the ASME, Vol. 121(3): pp. 329-335.
Zha, X. F. and Du, H., 2001, "Mechanical Systems and Assemblies Modeling Using Knowledge Intensive Petri Net Formalisms," Artificial Intelligence for Engineering Design, Analysis and Manufacturing, Vol. 15(2), pp. 145-171.
Zha, X. F. and Lu, W. F., 2002a, "Knowledge Support for Customer-Based Design for Mass Customization," AID'02, Kluwer Academic Press, pp. 407-429.
Zha, X. F. and Lu, W. F., 2002b, "Knowledge Intensive Support for Product Family Design," Proceedings of 2002 ASME DETC, Paper No. DETC02/DAC-34098.
Zha, X. F., 2002, "Web-based Knowledge Intensive Intelligent Support for Robot Family Design," Proceedings of the 2002 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vol. 2, pp. 1814-1819.
Zha, X. F., Sriram, R. D., Lu, W. F., and Wang, F., 2003, "Evaluation and Selection in Product Design for Mass Customization," Intelligent Knowledge-based Systems: Business and Technology in the New Millennium, Cornelius T. Leondes (ed.), Kluwer Academic Publishers, USA.
KNOWLEDGE MANAGEMENT SYSTEMS IN CONTINUOUS PRODUCT INNOVATION
MARIANO CORSO, ANTONELLA MARTINI, LUISA PELLEGRINI, AND EMILIO PAOLUCCI
1. INTRODUCTION
Knowledge Management (KM) is a relatively young but very hot topic in management research and practice. Leading companies are reshaping their organisations in order to increase their ability to manage knowledge sharing and transfer within and across their organisational boundaries. Since the early '90s, management literature has progressively highlighted the importance of KM as the main source of long-term competitive advantage; many contributions emerged from different fields, reflecting, therefore, very diverse roots. Product Innovation (PI), in particular, is one of the most promising areas where Knowledge Management is today applied and studied. It is assuming a central role in strategic competition because of the magnitude and endurance of the competitive advantage it generates, and the intrinsic difficulty of imitation related to path dependency [1; 2; 3]. Furthermore, the continuous rise of technological opportunities, new competitors and new customer requests, as well as the hyper-competition which characterizes the environment [4], have not only ascribed great importance to PI, but have also imposed a complete change in the organization and management of New Product Development (NPD) projects. As product development processes become more and more frequent and interrelated, management attention progressively shifts from the single project to the reuse of design solutions over time [2; 5; 6; 7] in a project family [8; 9], as well as to the company-level process of learning and knowledge transfer and reuse [10; 11; 12; 13]. Accordingly, management literature regarding PI process organization and management evolved from a "relay race" approach to a cognitive
approach, that is, from an approach interpreting Product Innovation as an activity where the most important strategic and organizational variables are properly planned, to an approach which considers product development as a knowledge-intensive activity [14].
While focusing attention on PI issues connected with knowledge creation and management, the cognitive approach offers a new perspective for supporting management of the PI process: it requires companies to become more effective in managing knowledge, overcoming space, time and organisational barriers, mostly due to the separation between the knowledge source and the locus where knowledge itself is potentially used [15]. Overcoming these barriers, which may hinder synergy and learning, is the essence of KM. For western European Small and Medium Enterprises (SMEs) in particular, the main challenge in Product Innovation is not managing major R&D projects but continuously improving products and services in all phases of the product life cycle; the engineering phases therefore assume great importance, unlike in large companies. This implies the ability to create and manage knowledge in all company processes, also leveraging external sources of knowledge. Organizational and managerial tools, along with new emerging Information and Communication Technologies (ICTs), particularly Internet applications, can play a key role in this process, potentially reshaping competition. By providing quick and easy access to external sources of knowledge and to new and more intense communication channels with partners, ICTs can reduce the importance of traditional constraints on SME innovation ability, while leveraging their flexibility and responsiveness. Using state-of-the-art technologies, innovative SMEs will become more capable of developing and exploiting their intellectual capital, both inside their borders and in knowledge-intensive and dynamic networks. Less innovative SMEs are probably doomed to be progressively swept off the market by new competitors from Eastern Europe and developing countries. In the area of PI, the use of the Internet, Extranets and Intranets, and other tools such as Product Data Management, Virtual Prototyping, and Computer Aided Design, is expected to substantially reshape the overall process of knowledge creation, embodiment and reuse. Notwithstanding these facts, current managerial literature on KM in Product Innovation is characterized by an ICT bias and disregards the importance of integration between three types of levers: organizational, managerial and ICT tools. But while there is a growing need to manage knowledge in PI, traditional literature lacked empirically tested supportive models to help managers understand 1) the processes through which knowledge is managed across wide and dynamic networks, 2) the tools supporting such processes and 3) their impact on performance. New empirically grounded contributions are therefore needed to support SMEs in developed regions in adequately combining such tools in order to rethink their Knowledge Management Systems (KMS) to sustain Product Innovation in their specific environments. This chapter aims at identifying and describing the emergent configurations for KM within SMEs, as well as the determinants of the adoption of such configurations
and the impact on firm performance. A particular emphasis is placed on the use of Internet technologies and their impact on the sharing and transfer of knowledge in firms, both internally and with other partners. We present results obtained in a broad research project which combines comparative case studies and survey methodologies. In the early stage of the project (see §3.2), the nature of the investigated phenomenon and the substantial lack of consolidated models located our analysis in the pre-paradigmatic phase of theory development, suggesting the application of methodologies based on case analysis. In the second stage of the project (see §4), evidence was based on a survey of a random sample of 127 SMEs located in Northern and Central Italy. SMEs in this part of Italy are of two types: (i) firms that sell directly to consumers; (ii) suppliers of large organizations located in the same geographical area. The results presented are likely to be applicable to SMEs with similar characteristics. The rest of the chapter is divided into six sections. The next section provides a definition of Knowledge Management Systems; §3 discusses the state-of-the-art literature on KM in PI. Sections four and five describe, respectively, the investigation framework and the methodology adopted in the empirical research. Section six presents and interprets the results from the field studies. Finally, §7 provides conclusions and suggestions for further research.

2. KNOWLEDGE AND KNOWLEDGE MANAGEMENT
2.1. The concept of knowledge in management literature
It is commonly accepted that knowledge represents the most significant resource of our time and the main source of power and competitive advantage [16]. Yet the concept of knowledge is very complex and has been approached from several points of view in the literature; it is therefore very hard to give one single definition of knowledge. We will therefore resort to a multidimensional definition, proposing a set of six complementary definitions that, while not singly comprehensive, together give an overview of how the concept of knowledge has been used in management literature.
• Knowledge is based on human belief. This derives from the Greek philosophers, who believed that "knowledge is a true justified belief". Knowledge, in other words, is not static, absolute and objective, but rather dynamic, relative and subjective, as it emerges from beliefs that are person dependent. Knowledge, therefore, always involves a person who knows, and it is based on his or her perspectives and intentions. An organization, as a consequence, can only learn through individuals.
• Knowledge is a purposeful set of information. Knowledge is more than information and data [17]. Data are single observations about facts, so they are not necessarily meaningful; information results from placing data together, including the context, in messages that are meaningful to someone. Knowing, finally, does not mean only having information about a certain topic, but also using it according to a certain purpose. Knowledge therefore always concerns action and is a result of purposeful
human thinking. Thinking is the process that makes information useful. Thinking is the key to piecing information together, reflecting on experience, generating insights and using those insights to solve problems.
• Knowledge is dynamically accumulated over time. Different knowledge bases at the individual or organizational levels derive from different paths or trajectories of accumulation of information. The uniqueness and competitive advantage of an organization may be explained in terms of its unique process of knowledge acquisition, articulation and enhancement. This knowledge accumulated over time creates firm-specific resources [18; 19; 20; 21; 22; 23] or core competences [16] that are the key to understanding a company's strategy and results. The stock of knowledge that a company controls at a certain time also influences its ability to learn: to this aim the concept of "absorptive capacity" has been introduced, which is the ability of a firm to recognize, create, store and reuse critical knowledge according to its prior level of relevant knowledge. Over time the knowledge base is operationalized and embedded in routines which include "the forms, rules, procedures, conventions, strategies and technologies around which organizations are constructed and through which they operate. They also include the structure of beliefs, frameworks, paradigms, codes, cultures and knowledge that buttress, elaborate and contradict the formal routine".
• Knowledge circulates at organizational level. People do not learn on their own: the transfer of knowledge among individuals within a certain community helps to create new knowledge. In communities people come to embody the ideas, perspectives, prejudices, language and practices of their community. Knowledge circulates through communities. The organization can facilitate this process by encouraging and co-ordinating communication and mutual learning. The transfer of knowledge from one community to another can happen in both tacit and explicit forms.
• Knowledge can be shared in tacit or explicit forms. Explicit, or "codified", knowledge refers to knowledge that is transmittable in formal, systematic language. It is discrete, or "digital"; it is captured in records of the past such as libraries, archives, and databases, and is assessed on a sequential basis. However, most knowledge remains in tacit form, deeply rooted in a specific context. It entails knowledge which is difficult to express, formalize or share in an explicit way. It involves both cognitive (i.e., mental models which include schemata, paradigms and beliefs that help individuals to perceive and define their world) and technical (i.e., concrete know-how, crafts, and skills applied in specific contexts) elements. Tacit knowledge can be classified in four categories: a) hard-to-pin-down skills ("know-how"), b) mental models, c) ways of approaching problems (the decision tree people use), and d) organizational routines. Tacit knowledge is intangible and difficult to imitate, so according to the Resource-Based View it is potentially an important asset for creating competitive advantage [19; 20; 21; 22; 23]. Explicit knowledge can be more easily described and transferred using documents, artefacts or software, and can be more promptly transferred and shared. Tacit and explicit knowledge are not totally separate but mutually complementary entities [24]. The assumption is that knowledge is created through the interaction between tacit and explicit knowledge.
In particular, knowledge is created through four patterns of interaction between tacit
and explicit knowledge: socialization, externalization, combination and internalization.
• Knowledge is created at the boundaries of the old through an incremental process. The process of creation of knowledge relies on combination, comparison and synthesis of what people already know in terms of experience, abilities, information and explicit knowledge. The output of the process of learning is new knowledge. Although knowledge can allow radical changes and discontinuous innovation, the process of learning through which knowledge is acquired is always somehow continuous and incremental.

Knowledge, so defined, can be classified according to different dimensions:
a) the nature of what is known [25]:
- Declarative knowledge (know what)
- Procedural knowledge (know how)
- Causal knowledge (know why)
- Self-motivated creativity (care why)
b) the level of diffusion in an organization:
- Individual level
- Group level
- Organizational level
- Inter-organizational level
c) the level of generality and abstraction [26; 17]:
- Abstract and general knowledge
- Specific knowledge
d) the way it is capitalized in the organization [27; 28; 29]:
- Embrained knowledge
- Embodied knowledge
- Encultured knowledge
- Embedded knowledge
- Encoded knowledge
e) the scope of knowledge [30; 31]:
- Component knowledge
- Architectural knowledge
2.2. Defining a knowledge management system
Knowledge Management is a very complex and multidisciplinary field. Many scholars argue that the term "Knowledge Management" may in itself be perceived as contradictory: knowledge is not a corporate resource, as it belongs to individuals. The purpose of KM is to enhance firm performance by explicitly designing and implementing tools, processes, systems, structures, and culture to improve the creation, acquisition, application and exploitation of the knowledge essential for present operations and for future competitive success.
Many definitions of KM have been proposed:
- KM is the systematic, explicit, and deliberate building, renewal, and application of knowledge to maximize an enterprise's knowledge-related effectiveness and returns from its knowledge assets [32];
- KM is getting the right knowledge to the right people at the right time so they can make the best decision [33];
- KM is bringing tacit knowledge to the surface, consolidating it in forms by which it is more widely accessible, and promoting its continuous creation;
- KM is a set of policies, procedures and technologies employed for operating a continuously updated linked pair of networked databases;
- KM is the process of capturing, distributing, and effectively using knowledge;
- KM is the process of capturing the collective expertise and intelligence in an organization and using them to foster innovation through continued organizational learning [25; 34; 35].

All these definitions underpin some relevant aspects of KM:
- KM is a configuration of technical, organizational and managerial choices;
- the direct effect of KM is influencing people's behavior and, consequently, company performance;
- KM can improve effectiveness in all phases of the knowledge lifecycle, from knowledge assimilation and generation, to transfer and sharing, and capitalisation and reuse.

A more comprehensive and at the same time operative definition of Knowledge Management can therefore be as follows: Knowledge Management is the sum of management systems, organisational mechanisms, and information and communication technologies (the Levers) through which an organisation fosters and focuses individual and group behaviour in terms of assimilation and generation, transfer and sharing, and capitalisation and reuse of knowledge, in tacit or explicit forms, that is useful to the organisation.
Knowledge Management is not about managing "knowledge", nor about managing people; both are more and more difficult, especially when dealing with the complex tasks of knowledge workers. KM is rather about creating an organizational environment where people are naturally encouraged to learn and share knowledge. Knowledge Management can therefore be viewed as an emergent process in which people are encouraged to align goals, integrate bits and pieces of information within and across organizational boundaries, and produce new knowledge which is usable and useful to the organization. Knowledge Management Systems (KMS) therefore exist and must be designed in the context of organizations, organizational culture and other management systems. The managerial challenge is to create a sustainable work organization, i.e., a configuration of organizational mechanisms, ICT and management tools, which enables efficiency, innovation and good quality of working life.
3. LITERATURE REVIEW
Management literature has highlighted how knowledge becomes the only source of sustainable competitive advantage in turbulent contexts, and the cognitive perspective represents the most adequate approach to analyse and understand Product Innovation. The roots of the cognitive perspective can be found in the Resource-Based View [36; 18]: "a resource-based theory of the firm thus entails a knowledge-based perspective" [37], as knowledge leads to a set of capabilities enhancing survival and growth chances [38]. The Resource-Based View considers the firm as a set of resources whose accumulation and use over time, through innovation processes, explain the dynamics of competitive advantage acquisition and exploitation [19; 20; 21; 22; 23]. More particularly, the Resource-Based View highlights how the exclusive possession of resources, the inputs the firm owns or controls [39], originates rents, as resources are not uniformly distributed among firms and are characterized by mobility barriers. The combination of resources creates distinctive competencies, which allow the firm to reach positions of competitive advantage over competitors. The sustainability of competitive advantage depends on resource combination and characteristics, with particular reference to the aspects of value (the ability to seize opportunities or thwart competitive threats), scarcity (the lack of competitors in the industry), the imperfect possibility of imitation (the resource can be sustained for long periods without competitors replicating or acquiring it), and the lack of substitutes (the lack of strategic equivalents) [23]. Accordingly, because of its characteristics of tacitness, inimitability and immobility, knowledge is a major source of competitive advantage [40]. As during the last few years the cognitive approach has produced a large but fragmented mass of literature, the objective of this section is to tie this literature together in order to offer an interpretative review, following a historical-evolutive perspective. We intend to produce a coherent framework to help understand what is actually known regarding Knowledge Management in Product Innovation and what is the emergent trend in the research itself. As PI becomes a daily concern and knowledge assumes a critical role, companies are required to become more effective in managing knowledge within and across their organisational borders. This entails overcoming organisational, time, and space barriers, mostly due to the separation between the source of knowledge and the locus where knowledge itself is potentially applied [12; 41]. Overcoming these barriers that may hinder synergy and learning is the essence of Knowledge Management [42]. Following the example of excellent companies, seminal contributions show how sustainable competitive advantage may derive from a systemic approach to managing knowledge in the PI process. Excellent companies, in particular, show superior ability both in enlarging the scope of the PI process (including all main sources of knowledge), and in proactively fostering the overall process of knowledge creation and management [34; 43; 44; 45]. On the basis of the framework represented in Figure 1, we can therefore review the literature on Knowledge Management in Product Innovation, tracking how it evolved towards systemic management of knowledge along two main dimensions: i) the scope of the knowledge-creating PI system, and ii) the emphasis in the Knowledge Management process.
Figure 1. Knowledge Management in Product Innovation: a Literature Review. (The figure maps the main literature streams, from Concurrent Engineering to Multi-Project Management, Organisational Learning and Inter-organisational Design, against the scope of the knowledge-creating system: single PI process, PI portfolio, and relationships with external actors.)
The first dimension summarises the degree to which contributions in the literature progressively enlarge the boundaries of the PI process to take into account possible sources or uses of knowledge, both internal and external. On this dimension, management scholars in the cognitive approach progressively shifted attention from knowledge integration among PI phases within the same project, to knowledge integration among different PI projects over time and, finally, to knowledge integration with internal and external partners outside the traditional boundaries of product development. The second dimension, the emphasis in the KM process, is related to the level to which the different contributions consider the overall process of knowledge creation and management. A KM process is in general described as a sequence of three or more sub-processes or phases [46; 47; 48], not necessarily sequentially or hierarchically ordered:
- knowledge transferring and sharing (Knowledge Transfer);
- knowledge capitalisation and reuse (Knowledge Capitalisation);
- knowledge assimilation and generation (Knowledge Creation).
The literature showed different levels of completeness in analysing the Knowledge Management process, going from mere attention to information and knowledge sharing, to knowledge codification and storing for reuse and, finally, to the overall process of knowledge creation and management.
The combination of these two dimensions produces a bi-dimensional space where the evolution over time of the main streams of literature can be mapped (Fig. 1). Readers should however be warned that overlapping and fuzzy borders between different streams exist¹.

3.1. Main streams in literature
Concurrent engineering

Since the early '80s Concurrent Engineering (CE) has been considered the new paradigm for product development. When compared with more traditional approaches, CE is characterised by a stronger emphasis on integration among different product development phases: phased program planning is replaced by the joint participation of different functional groups in the product development process [10]. This creates many advantages that have been highlighted in the literature; for example, shorter time to market [49], better communication and less inter-functional conflict [1; 10; 50; 51; 52; 53], fewer reworks and loops and, consequently, higher quality and lower cost products [49]. As far as Knowledge Management is concerned, CE played a key role in the development of a cognitive perspective in Product Innovation. CE, in fact, stressed the importance of richer and more continuous communication within the development process, shifting attention from the transfer of articulated and complete information to the sharing of knowledge, often in tacit forms. The need for overlapping ongoing activities, as a matter of fact, implies working in cross-functional groups, often co-located, where stronger and richer communication is fundamental to making innovation and co-ordination possible [54]. As the main emphasis is on the integration and speed of a specific innovation process, knowledge is shared and socialised in tacit and contextual forms, while limited emphasis is placed on codifying knowledge or on abstracting and generalising from current experience to foster future innovation.

Flexible design

With CE, management attention shifted from designing structures for innovation to designing the innovation process, thus inducing a more holistic perspective on product development. In KM terms, however, CE limited its focus to the implementation and sharing of existing knowledge, without taking into account the overall learning process. CE, moreover, maintained a rigid separation between the locus of knowledge generation, where the product concept is generated, and the locus of implementation, where the product is actually developed [52; 53; 54; 56]. Iansiti (1995) highlights how "concurrent engineering models normally do not imply the simultaneous execution of conceptualisation and implementation, but rather the joint participation of different functional groups in the execution of these separate and sequential sets of activities"², along which the product is

¹ In the analysis of literature we consider articles published in major English-language North American and European journals. These studies have been selected on the basis of their citation degree by other researchers.
² Iansiti, M. (1995), p. 41.
defined, designed, manufactured and launched in the market. But in extremely turbulent environments, unpredictable technological and market changes create deadlines that even the fastest development process cannot meet. In such environments, the ability to react to newly discovered information during project evolution becomes the key factor for competitive advantage itself. In this context a new and more flexible model of product development is emerging [56; 57; 58], which, in deep contrast with the traditional one, implies the ability to move the concept-freeze milestone as close to market introduction as possible. This implies the ability to overlap the two fundamental development phases: on the one side, concept development (analysis of customer needs and technological possibilities, together with their translation into a detailed concept), which aims at specifying product features, architecture and critical components, and, on the other, the implementation phase (translation of the product concept objectives into a detailed design and, thus, into a manufacturable product). In a Knowledge Management perspective this means taking into account and fostering rapid learning loops within the overall product development process.

Multi-project management
Starting in the late '80s, a new stream of literature emerged highlighting the potential limits of CE over a long-term horizon. One of the main criticisms was that, while emphasising integration among PI phases, CE potentially isolates each innovation process from the rest of the organisation. As Product Innovation is becoming more and more frequent and resource consuming, however, effectiveness in managing the single product is not enough. Success depends even more on exploiting synergies amongst projects by fostering both commonality and reuse of design solutions over time, thus shifting attention to project families. In particular, re-using design solutions [5; 6] and focusing on product families [8; 9] mean concentrating attention on the architecture of the product, that is, on the way components and skills are integrated and linked together into a coherent whole [30]. In this way it is possible to devote more attention to managing sets of related projects, thus avoiding the inefficiencies connected with the 'micromanagement' of individual projects and obtaining better performances in terms of common parts ratio, carried-over parts ratio and design reuse [2]. Although Multi-Project Management was nothing new in management literature, the problem of portfolio management in Product Innovation could hardly be linked to the traditional applications in engineering projects. The latter, in fact, focus on contexts where the main problem is managing interdependencies among simultaneous projects deriving from the sharing of a common resource pool [59]. In PI, on the contrary, most interdependencies derive from the transfer of knowledge and solutions between projects over time [1; 60; 61]. Analysing interdependencies, some authors focus on the actual object of the interaction [34; 62], distinguishing between interactions related to the exchange of tangible technological solutions (e.g., parts, components), of codified knowledge (patents, processes and formulas) and of non-codified know-how, generally person-embodied. Others focus on the scope of the interaction [39], distinguishing between component level and architectural level. A third, and last, group of contributions focuses on the
approach in the transfer process, which can either be reactive, when solutions and knowledge from past projects are retrieved and reused ex post, or proactive, when solutions are deliberately developed to be used in the future for projects that have not yet been planned [61; 7]. Many authors showed how traditional reactive policies based on carry-over of parts and subsystems are intrinsically limited and may even be detrimental to innovation [1; 63]. Excellent companies instead use proactive policies where ex-ante efforts are made to predict the characteristics and features of new parts and subsystems to suit future applications. Depending on the architectural or component knowledge embodied in the solutions, these proactive policies are named "product platforms" or "shelf innovation" [5; 6; 7; 8; 9; 64]. The urgency to manage interdependencies among projects over time induced many companies to conceive new organizational and managerial approaches. In many cases this entailed the introduction of new roles and intermediate decision levels, such as Product Manager, Platform and Program Manager [6; 65]. Cusumano and Nobeoka (1992-n. 2) explicitly introduce commonality and reuse of design solutions over time in their strategy-structure-performance framework, systematising the management literature on the PI process in the auto industry. Other authors stressed the importance of developing product plans at company or product family levels [5; 6; 61]. In particular, Wheelwright and Sasser (1989-n. 5) emphasise the necessity of a 'New Product Development Map' which allows managers to understand the technological and market forces driving past and present evolution of product lines from one generation to another, thus providing "a context for relating concurrent projects to one another"³. Linking the intensity of project changes to manufacturing process innovation, Wheelwright and Clark (1992-n. 6) allege that many NPD failures are caused by the lack of an aggregate plan for coordinating existing projects. Meyer and Utterback (1993-n. 9) and Sanderson and Uzumeri (1995-n. 8) emphasise not only the necessity of shifting attention from single projects to product families, in order to enable the development and sharing of key components and assets, but also the opportunity to go beyond individual product families, in order to consider relationships between product families, as they enable higher commonality in technologies and marketing. More particularly, Meyer and Utterback (1993-n. 9), connecting product families to the management of a firm's core capabilities, develop a normative model to map product families and evaluate the dynamics of the embedded core capabilities. The resulting product family map is developed into four hierarchical levels (the family itself, the platforms, the product extensions and, then, the single products) and constitutes the basis for assessing the evolution of a firm's core capabilities, analysed along their four key dimensions: product technology, customer needs comprehension, distribution and production. Sanderson and Uzumeri (1995-n. 8), instead, trace back Sony's decade-long dominance in Walkman production to its skill at managing the evolution of its product families and, more exactly, to four specific tactics of product planning: the variety-intensive product strategy, the multilateral management of product design, the judicious use of industrial design and the commitment to minimizing design cost.

³ Wheelwright and Sasser (1989-n. 5), p. 125.
In all cases product solutions are considered the most powerful vehicles for accumulating and transferring knowledge from one product to another.

Organisational learning
A rich stream of literature from different research fields emerged in the last decade dealing with organisational learning in Product Innovation [42]. Compared with the previously described streams, these contributions place much more emphasis on the dynamics of knowledge creation and transfer over time. As in Multi-Project Management, the focus is on the relations among projects over time rather than on the single development process. While multi-project literature mostly focuses on knowledge embodied in design solutions, organisational learning literature emphasises the importance of transferring knowledge also in tacit form, or embedding it into processes and organisational routines [34; 44; 66]. While multi-project literature, moreover, considers the reapplication of knowledge as a rather automatic process, organisational learning literature emphasises how the issue is too articulated to be dealt with normatively [66; 67] and how learning and reuse of knowledge may face barriers at both the organisational and the individual levels, calling therefore for aware support by management [29]. In particular, many potential difficulties entangle the process of learning across different projects [11; 12; 68]. Von Hippel and Tyre (1995-n. 68) focus their attention on problems connected with knowledge reuse when dealing with innovative projects. Imai, Nonaka and Takeuchi (1995-n. 44) devote their attention to the urgency of unlearning past lessons in order to eliminate dangers of rigidity in NPD. Arora and Gambardella (1994-n. 26) emphasise how knowledge has to be abstracted from each specific project and generalized in order to extend past experience to future PI projects. Abstraction and generalization entail, respectively, the selection of some relevant information and elements, and the definition of the criteria which allow knowledge to be applied. Only abstract and general knowledge allows the creation of both a long-term competitive advantage in different product/market segments and new businesses: a firm's competitiveness comes from the ability to build, at better cost and time conditions than its competitors, the key competencies to develop new products [16]. Other authors stress the importance of the role of management in designing adequate enablers for learning to take place in Product Innovation [34; 44]. Bartezzaghi et al. (1997-n. 12) suggest that designing adequate vehicles to support knowledge storing and dissemination over time is a fundamental lever to foster innovation. These vehicles should be designed coherently with the organisation's corporate and national culture [69]. Nonaka (1991-n. 34) and Hedlund (1994-n. 43) classify the different processes of knowledge conversion and introduce the concept of the knowledge-creating spiral: new knowledge is generated through cycles of knowledge socialisation, externalisation, combination and internalisation. Nonaka and Konno (1998-n. 70) reaffirm the above model, describing a 'space' (the concept of 'ba') that is conducive to knowledge creation. Other authors focus on the concept of 'communities of practice', a special type of informal network that emerges in an organization and to which access is
dependent on social acceptance [71; 72; 73; 74]. As these communities play a role in the creation of collective knowledge, managers should respect this 'situated activity' in order to develop them. Most contributions, however, share the underlying assumption that Product Innovation is the outcome of NPD projects over time. Downstream phases are considered important only insofar as they can provide information for feeding next-generation product development, or constraints that should be anticipated and considered during development [1]. Some contributions, however, diverge from such a perspective, indicating the necessity to extend innovative efforts to the overall product life-cycle [13; 62; 75; 76; 77]. Itami (1987-n. 62) suggests that excellent companies' experimentation and technological strategies are often aimed at generating knowledge through trial-and-error learning processes that leave the lab and extend to production activities and the market. In this way it is possible to start preventive or experimental commercialisation, allowing valuable engagement with consumers in a phase in which it is still possible to introduce modifications and technological improvements. Bartezzaghi et al. (1999-n. 78) and Corso (2000-n. 13) summarise by stressing the importance of shifting attention from product development to Continuous Product Innovation (CPI), a cross-functional knowledge-based process leading to Product Innovation along the whole product life cycle. Product development should be considered only as the first, yet important, phase in Product Innovation, which also extends to downstream phases such as manufacturing and after-sale services. While in traditional models feedback is stored for feeding next-generation product development, in Continuous Product Innovation all stages in the product life cycle are potential opportunities for innovation.

Inter-organisational design
Starting from the CE concept of inter-functional teams, two partially overlapping streams recently emerged in the product development literature, further expanding the scope of the PI process to take into account the importance of assimilating and integrating knowledge from outside the traditional boundaries of R&D. Some authors stressed the importance of designing new roles within R&D, such as gatekeepers [50; 78; 79], in order to bridge to the external environment. Others stressed the importance of direct and early involvement of customers and suppliers in inter-organisational groups [1; 11; 80; 81]. More and more contributions stress how, for the single firm, external complexity hinders the possibility of managing the knowledge system supporting the whole Product Innovation process: not only researchers but also companies themselves become specialised nodes within complex and dynamic knowledge-creating networks. Reid et al. (2001-n. 81) highlight the alliance form as the optimal collaborative structure for the knowledge-based enterprise, proposing a research model based on an alliance life cycle. Analysing how inter-organisational groups develop knowledge in the PI process, some authors focus on the network, comparing different industries and highlighting the interface aspects facilitating inter-organisational collaboration. Studies are based on
evidence from different industries, such as automotive [1; 82], ICT [57], automation technology [83], packaging equipment [84], biotechnology [85], pharmaceutical [86; 87], and aeronautical [88]. A second group of contributions investigates the specific relationships the firm builds with actors belonging to the supply chain (vertical agreements), with competitors (horizontal agreements) and with complementary firms and external institutions (cross agreements). In the past, contributions regarding vertical agreements with customers showed how early customer involvement can significantly enhance the probability of success in innovation activities and how such involvement should take place in different situations [89; 90; 91; 92]. More recently, contributions started focusing on supplier involvement, with emphasis placed on the critical role played by suppliers in the achievement of high performances in Product Innovation [84; 93]. A relevant group of contributions analyses the Japanese approach, emphasising how the creation of tight relationships with suppliers is based on strong interactivity, continual information exchange and deep reciprocal reliance [1; 82; 94; 95]. Others explicitly focus on Knowledge Management, with emphasis on the advantages of managing suppliers as sources of knowledge rather than vendors of parts and equipment [10; 11; 96; 97; 98]. A final group of contributions enlarges the scope of the knowledge-creating system outside the boundaries of the supply network. Clark and Fujimoto (1991-n. 1) highlight the increasing role played by horizontal agreements with competitors, emphasising how their objectives are shifting from pure market control and influence on standards and regulations to the joint development of technologies and components. Other authors stress the importance of cross agreements with complementary firms and external institutions in order to develop technological breakthroughs, or simply to scan technological opportunities and assimilate knowledge [93; 99; 100; 101].

3.2. The literature evolutive trend: towards KM configurations
Following a Knowledge Management perspective, the literature can be analysed in terms of the scope of the knowledge-creating system underpinning the Product Innovation process and of the emphasis placed on the different phases of the knowledge creation and management process. This analysis shows how the literature, starting from Concurrent Engineering, progressively enlarged the scope of the PI process, shifting from the need to remove cross-functional barriers within the same project, to the need to remove the time separation which isolates different PI projects from one another and, finally, to the opportunity to build inter-organisational relationships. Similarly, it shows how the emphasis in the KM process progressively moved from mere information and knowledge exchange, to knowledge embodiment and transfer for reuse and, finally, to the overall process of knowledge creation, diffusion and refinement over time. Each of the above-mentioned developments represents a gradual evolution rather than an abrupt leap; this evolution has progressively added to and refined the previous results, rather than contrasting and substituting them. The strong emphasis CE placed on the need to overcome the functional barriers isolating the knowledge sources involved in a project constituted the starting point from which the literature has indicated the
opportunity to search for synergies both internally, with other projects, and externally, with knowledge sources outside the organizational borders. Similarly, the CE emphasis on knowledge exchange constitutes the foundation for knowledge reuse and creation. In this sense, each single stream in the evolution of the two considered variables (scope of the knowledge-creating system and emphasis in the KM process) presupposes and, hence, comprehends the former contributions. In a Knowledge Management perspective, this means that the Product Innovation literature shows a single trend starting from CE and moving towards a more systemic and comprehensive approach to Knowledge Management in PI. The diffusion of new organizational models based on distributed teams and cross-company collaboration, and the availability of tools based on new ICT, challenged the traditional approaches to the creation and sharing of knowledge, demanding more aware and innovative KM approaches from management practitioners and scholars. But while there is a growing need to manage Knowledge in PI, the traditional literature lacked empirically tested supportive models to help managers understand 1) the processes through which knowledge is managed across wide and dynamic networks, 2) the ICT tools and the organizational/managerial mechanisms supporting such processes and 3) their impact on performance. In the last few years, different contributions have tried to fill this gap. Most articles highlight the existence of different approaches characterized by a different emphasis on the use of technologies and of organizational and managerial tools for managing the flow of knowledge in codified or articulated forms. In particular, in Hansen et al. (1999-n. 102) such Configurations are named the Codification Strategy (knowledge is codified and stored in databases where it can be accessed and used easily by anyone in the company) and the Personalization Strategy (knowledge is closely tied to the person who developed it and is shared mainly through direct person-to-person contacts: the computer's chief purpose is to help people communicate knowledge, not to store it). Corso et al. (2001-n. 103) identified three different ICT Approaches that SMEs follow in the adoption of ICT in Product Innovation, drawing evidence from a multiple-case study of 47 SMEs in Northern and Central Italy. On the basis of a contingency framework, such approaches can be related to product and system complexity. More exactly, the empirical research clearly showed how SMEs are influenced in their choice by product complexity, acting as a deterrent to ICT tool adoption in the PI process, and by system complexity, determining the need for technological co-ordination between SMEs and their customers. While confirming a general gap in the adoption of ICT tools by SMEs, Corso et al. (2001-n. 103) show how this gap cannot be ascribed to generic considerations concerning cultural lags. The pattern of ICT adoption should rather be analysed within the frame of the wider Knowledge Management System, which also includes organizational mechanisms and management practices. Compared with larger enterprises, in particular, SMEs tend to place more emphasis on the management of knowledge in tacit forms, and communication channels are inter-firm rather than intra-firm. Corso et al. (2003-n. 104) go further, linking the above ICT patterns with internal KM processes. Three different KM configurations emerge: "Traditional",
"Codification" and "Network-based"4. The 'Traditional approach' was followed by firms leveraging on traditional mechanisms to transfer and consolidate knowledge both internally and externally, relegating ICT tools to a marginal role; hence, emphasis is on teams, paper documents, interpersonal relationships, gatekeepers and interaction with customers and suppliers. The 'Codification approach' is typical ofthose firms giving great importance to ICTs (particularly CAE, CAM, 2D CAD, DB) containing design solutions and lntraNets, for consolidating and transferring knowledge, and making it codified and peopleindependent. The 'Network-based approach', lastly, is internally characterized by the same behavior showed by firms belonging to the 'Traditional approach': knowledge transfer and consolidation mainly rely on traditional tools (teams, paper documents and interpersonal relationships). At the inter-firms level the use of organizational Levers, with particular reference to gatekeepers and interactions with customers and suppliers, is supported by 'border' ICTs, which is those tools allowing the exchange of data across interfaces toward the external environment (i.e., mainly 3D CAD and InterNet connections) . What is still lacking is the development of empirically tested supportive models to help managers in designing and implementing organisational and managerial tools to foster Knowledge Management. Agenda for future research should therefore analyse in more detail processes through which knowledge, in its different forms, is assimilated, created, transferred, stored and retrieved across wide and dynamic networks, as well as the organisational and managerial tools through which firms can influence such processes. Finally, much more emphasis should be devoted to the influence and potential benefits of emerging Information and Communication Technologies based on internetworking. In the present chapter we are exploring three research questions: RQ1. Find out how widespread the three KM Configurations are and if these configurations coverthe whole field. Three hypotheses arepossible: 1) allthe configurations existin sufficiently large numbers, and together they cover a great percentage of all possible KM configurations; 2) only one or two configurations are really widespread, and there is no other widespread configuration; 3) only one or two configurations are widespread and there are also one or two other large configurations; RQ2. For those configurations, we investigate the drivers which explainsuch choices; and RQ3. Their impact on performance. 4. THE INVESTIGATION FRAMEWORK
Based on the literature and previous case studies, we developed the research investigation framework shown in Figure 2, which analyzes three groups of variables and their relationships: Contingencies, KM Configurations and Performances.
[Diagram: three blocks, CONTINGENCIES, KNOWLEDGE MANAGEMENT CONFIGURATIONS (RQ1) and PERFORMANCES, connected by arrows a, b and c, with RQ2 and RQ3 labelling the links between blocks.]
Figure 2. The Investigation Framework.
Contingencies are exogenous to the model and point out how some firm-specific variables can influence the choice of the ICT and organizational tools (the Levers) which support the KM process in Continuous Product Innovation. KM Configurations identify the set of Levers SMEs adopt in order to transfer and consolidate knowledge. Knowledge transfer focuses on the flow of knowledge both within and outside the organizational boundaries of the firm, while knowledge consolidation represents the efforts organizations make to capture and consolidate knowledge for future retrieval. Finally, the last block, Performances, sheds light on the effects that the different Configurations have in terms of performance. In the model, the choice of the Levers, made according to the Contingencies (arrow a), produces effects in terms of Performance (arrow b). The relationship between Levers and Performances is not one-way: if in the short run the Levers can have a relevant impact on Performance, in the long run Performances tend to affect the choice and use of ICT tools, as well as the selection of the most appropriate organizational tools (arrow c). Specific variables in each group were identified in the previous phase of the research, using comparative case studies based on semi-structured interviews [103; 104]. Although a large number of Contingencies were identified in this first part of the research, we focused on those which, according to the evidence from the case studies, have the greatest influence in increasing PI complexity inside SMEs, namely the level of geographical dispersion, product complexity, the degree of customization and the position in the supply chain. The level of dispersion distinguishes firms with only one manufacturing site from those with two or more manufacturing sites. Particularly in SMEs, innovation focuses on the engineering phase rather than on R&D; for this reason the level of geographical dispersion influences the need to transfer knowledge between the different sites. Two indicators define the Contingency connected with product complexity: internal architectural complexity and technological complexity [103; 104]. The former conceptualizes the need to integrate the different items into the product's final architecture: the larger the number of components and subsystems,
the more difficult the architectural choice and, consequently, the more relevant the role of architectural knowledge [30]. It is measured in terms of the number of both components and sub-systems (from now on called items) in the bill of materials. Technological complexity reflects the variety of distinct knowledge and skill bases which need to be integrated into the final product: the greater the technological complexity, the greater the span of control, that is, the wider the variety of skills and capabilities required within the firm. Hence, the multi-technological nature of products has significant implications for their management in terms of competencies to be developed and knowledge bases to be mastered and integrated. Technological complexity is measured by the Herfindahl-Hirschman Index, which considers the sum of the squared cost percentages attributed to the different embedded technologies (mainly mechanical, electromechanical, electronic, hydraulic and software). Catalogue or custom production (the degree of customization) explains the different knowledge sources: differently from what happens in catalogue production, in the case of customization the customer contributes to the definition of product characteristics, thus becoming an external source of knowledge. Finally, the position in the supply chain is defined by the production of final products or of components/sub-components: it conceptualizes the need to integrate the manufactured item into the final product architecture and hence, in KM terms, the need to own the architectural competence regarding the modality of the integration.

KM Configurations are identified by organizational and ICT Levers, which represent 'vehicles' capturing and disseminating knowledge within and outside organizational boundaries (to final customers and suppliers) and to future projects. Organizational Levers refer to [12] i) people and ii) reports and databases. People (i) [97] are represented in this chapter by the following Levers: 1) interpersonal relationships between the R&D department members, 2) internal meetings for the transfer of design solutions which emerged in past projects, 3) project teams involving members from other departments (3a) or customers/suppliers (3b), 4) gatekeepers connecting the investigated firm with the external environment and, finally, 5) interaction with customers and with suppliers. Reports and databases (ii) are represented in this chapter by 6) paper documents and 7) ad hoc databases (DB) for storing design solutions. The above-mentioned variables can be classified according to two dimensions (Table 1): i) the level of codification of the Levers, that is, the possibility to articulate and, hence, embody knowledge in concrete and tangible representations [41] such as documents and software [105], and ii) the degree of openness towards the external environment [103].

Table 1 Organizational levers classification

Degree of openness   Articulated/explicit levers             Tacit levers
Internal             - Meetings (2)                          - Interpersonal relations (1)
                     - Paper documents (6)                   - Project Teams (3a)
                     - Databases for design solutions (7)
External                                                     - Project Teams (3b)
                                                             - Gatekeepers (4)
                                                             - Interaction with Customers/Suppliers (5)
ICT Levers can be classified into two groups: the specific ICT tools adopted in the PI process and the tools supporting integration among organizational units and external actors. With reference to the first group, a large number of ICTs have been analyzed: Product Data Management (PDM), two-dimensional Computer Aided Design (2D CAD), Computer Aided Engineering (CAE) and Computer Aided Manufacturing (CAM). With regard to the second group, ICT supporting integration, the degree of openness towards the other departments and the external environment has been analyzed. We investigated the presence of i) internal networks (Intra-Nets), which connect different departments inside organizational borders or within a group (including e-mail and file sharing to support communication within the technical office and with the other departments), ii) external networks (Inter-Nets) connecting different actors along the supply chain and iii) three-dimensional CAD (3D CAD), which, by allowing virtual objects to be shared and jointly modified with customers and suppliers, has been included among the tools supporting integration, as the previous step of the research [103] suggested that this tool is usually adopted by SMEs to technologically support coordination, especially with customers. The last block in Figure 2 deals with the Performances connected with the PI process: they are typically operative in nature and assess the effectiveness and the efficiency of the PI process.

5. THE RESEARCH METHODOLOGY
The first phase of the research project started in 1999. In this stage, the research framework illustrated in the previous section was refined and specific variables in each class were identified. Evidence was based on a comparison of case studies in a statistically non-significant, interest-based sample. The use of semi-structured interviews and the selection of an interest-based sample, although introducing statistical limitations, allowed the researchers to gain a broader understanding of the research issue [for more details see the authors' previous publications, 103; 104]. The second stage of the research, whose results are described in this chapter, was fielded in 2000. One of the explicit objectives of this study was to investigate the emergent Configurations of ICT and organizational tools for KM in SMEs, discussing the Contingencies driving the choice of such Configurations and their impact on Performances. In this second stage, triangulation with the survey methodology aimed at validating the results obtained by means of the case studies.

The research sample
The study was carried out on 127 SMEs in Northern (Piedmont and Lombardy) and Central (Tuscany) Italy, operating in the mechanical, electronic, plastic and chemical industries.
Table 2 Population and sample characteristics

              Population (%)                   Sample
              Lombardy  Piedmont  Tuscany      Lombardy       Piedmont       Tuscany        Total
Industries                                     N.     %       N.     %       N.     %       N.      %
Mechanical    46.00     76.00     30.30        24     43.63   36     70.58    5     22.73    65     51.18
Electronic    23.00     17.17     27.40        11     20.00   12     23.53    8     36.36    31     24.41
Plastic       17.00      1.00     23.10        13     23.63    1      1.96    5     22.73    19     14.96
Chemical      14.00      5.83     19.20         7     12.74    2      3.93    3     18.18    12      9.45
Total        100.00    100.00    100.00        55    100.00   51    100.00   21    100.00   127    100.00
Table 3 Sample characteristics (turnover/employee and average employee number per industry)

Industries    Turnover/employee (Euro)    Average employee number
Mechanical    160,102                     100
Electronic    232,406                     110
Plastic       227,241                      67
Chemical      268,557                     148
Total         196,254                     102
The firms' names were taken from the Kompass yearbook. Two main criteria were used for the random selection of the sample:
- small and medium size, in terms of employees (from 35 to 350) and turnover (from 2.5 to 60 million Euro), because of the need to define what we meant by SME;
- manufacturing firms belonging to the mechanical, electronic, plastic and chemical industries, because of the importance of such sectors in the Italian economic system, both in terms of number of firms and of their share of turnover in the Italian GDP.
In Lombardy, 535 companies were contacted; of these, 61 firms (11.4%) returned the questionnaire, but only 55 (10.28%) had completed it. In Piedmont, 600 SMEs were contacted, with a 12.17% response rate (73 firms), but only 51 SMEs (8.5%) completed the questionnaire. In Tuscany, 139 SMEs were contacted: 21 of them (15.11%) returned the questionnaire completely filled in. The higher response rate for Tuscany does not depend on the way firms were contacted; the gap with the other two regions can be explained by the lower number of calls for survey participation in research projects in this area. Table 2 summarizes population and sample characteristics in terms of SME distribution per geographical area and industry. Table 3 reports the ratio between turnover and employee number and the average number of employees for each industry.
Survey development
After a first telephone contact and a preliminary discussion with managers regarding the research project aims, the selected SMEs were invited to fill in a questionnaire published on the Web. A message with the link to the research project Website was sent to all the people contacted for the survey. The Website contains a description of the research aims, instructions for filling in the questionnaire, the researchers' telephone numbers for further explanation/assistance and the questionnaire in html version. The representative of each SME responsible for filling in the questionnaire could read, fill in and send the questionnaire on line. Data were automatically transferred to a database and then checked for reliability by the researchers. In comparison with traditional survey tools, the use of the Internet offered advantages both for the interviewers (rapidity in receiving filled-in questionnaires and in data entry) and for the interviewees (rapidity of the filling-in and forwarding phase). However, for those firms that did not have Internet access or were unwilling to use it, the questionnaire was sent by fax. In order to reduce fill-in time, the questionnaire relied mainly on comparative scale answers (ordinal scales in which respondents have to choose the answer with the highest priority), multiple-choice answers, interval data (for example, numerical scales asking firms to give a score ranging from 1 to 4) and relative data. Non-comparative scales or open questions were used only for quantitative information, when there was no ambiguity in the answer or when it was impossible to fix a priori alternatives or intervals. Moreover, the html format of the questionnaire allowed tight control over its completion. The questionnaire, which contains 87 questions, is structured into five sections: 1) general information regarding the SME, in order to characterize each firm by its size, dispersion and competitive context; 2) the manufacturing system: its complexity, the innovations recently introduced, the relationships with customers and suppliers; 3) the product: its complexity and the innovations introduced; 4) the PI organization; 5) the use of ICT tools within the SME in PI. The incentive provided to participants consisted of a personalized report comparing the KM approach they followed with the one adopted by firms with similar characteristics. Working papers derived from the research were also provided.

Data analysis tools
Different statistical techniques have been used in relation to each research question (Table 4) because of the different objects analyzed. The explanation of each statistical tool choice is reported in the following section.

Table 4 Data analysis tools

Research question                                   Statistical tools
RQ1: KM Configuration identification                Cluster analysis
RQ2: KM Configuration drivers                       Non-linear regression (Probit model)
RQ3: KM Configuration impact on Performances        Factor analysis and association analysis
6. RESULTS
RQ1. Analysis of the diffusion level of the three KM Configurations

METHODOLOGY. In order to analyse the diffusion level of the three KM Configurations and answer the first research question, we resorted to cluster analysis; in particular, we used K-means clustering (a non-hierarchical technique). The potential risk of poor explanations that can derive from cluster analysis in pure survey approaches was avoided thanks to the insight gained in the first stage of our research project [106]. As a matter of fact, the three different approaches identified in the first research step [104], the Traditional, the Codification and the Network-based, were used as cluster seed points [106] (Table 5).

RESULTS. Cluster analysis divided the SMEs into three groups (Table 6), which represent the ICT and organizational approaches to KM in Product Innovation. Only the Levers with a clustering role were included in the analysis: eliminating variables that are not distinctive (i.e., that do not differ significantly) across the derived clusters allows the cluster analysis to define clusters based only on those variables exhibiting differences across the objects [29].
Table 5 Cluster seed points

Cluster          Project   DB for design   Interaction with      Gatekeepers   2D    3D    CAE/   Intra-   Inter-
                 teams     solutions       customers/suppliers                 CAD   CAD   CAM    Net      Net
Traditional      1         0               1                     1             0     0     0      0        0
Codification     0         1               0                     0             1     0     1      1        0
Network-based    1         0               1                     1             0     1     0      0        1

In this Table, which should be read horizontally, only the Levers actually used in the cluster analysis have been included.
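The clustering step can be reproduced along the following lines. This is a minimal sketch in Python with scikit-learn: the lever matrix below is a hypothetical stand-in for the survey's binary adoption data, while the seed matrix follows Table 5.

    import numpy as np
    from sklearn.cluster import KMeans

    # Columns follow Table 5: project teams, DB for design solutions,
    # interaction with customers/suppliers, gatekeepers, 2D CAD, 3D CAD,
    # CAE/CAM, Intra-Net, Inter-Net (binary adoption flags).
    seeds = np.array([
        [1, 0, 1, 1, 0, 0, 0, 0, 0],   # Traditional
        [0, 1, 0, 0, 1, 0, 1, 1, 0],   # Codification
        [1, 0, 1, 1, 0, 1, 0, 0, 1],   # Network-based
    ], dtype=float)

    # Hypothetical stand-in for the 127 x 9 survey matrix of lever adoption.
    levers = np.random.default_rng(0).integers(0, 2, size=(127, 9)).astype(float)

    # K-means initialized at the Table 5 seed points (non-hierarchical clustering).
    km = KMeans(n_clusters=3, init=seeds, n_init=1).fit(levers)
    labels = km.labels_   # cluster membership of each SME, cf. Table 6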
Table 6 KM configuration clusters

                                     Traditional        Codification       Network-based
                                     N.       %         N.       %         N.       %
Organizational tools
  Project Teams                      19       42.22      9       26.47     21       53.85
  DB for design solutions            12       26.67     26       76.47     18       46.15
  Interaction with cust./sup.        42       93.33     18       52.94     36       92.31
  Gatekeepers                        33       73.33     16       47.06     30       76.92
ICT tools
  2D CAD                             17       37.78     29       85.29     32       82.05
  3D CAD                              8       17.78     19       55.88     39      100.00
  CAE/CAM                             4        8.89     24       70.59     14       35.90
  Intra-Net                          26       57.78     33       97.06     26       66.67
  Inter-Net                          23       51.11     27       79.41     39      100.00
SMEs (N.) per cluster                45                 34                 39

This table should be read vertically: for each cluster, the values in the first column represent the number of SMEs, belonging to that cluster, adopting the specific Lever, while the values in the second column represent the % of SMEs adopting it with respect to the total number of SMEs in the cluster.
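The percentages in Table 6 are simply within-cluster adoption rates. A small arithmetic check in Python, using the counts from the Project Teams row of Table 6:

    # (number of adopting SMEs, cluster size) for the Project Teams Lever
    project_teams = {"Traditional": (19, 45),
                     "Codification": (9, 34),
                     "Network-based": (21, 39)}

    for cluster, (n_adopting, n_total) in project_teams.items():
        print(f"{cluster}: {100 * n_adopting / n_total:.2f}%")
    # Traditional: 42.22%, Codification: 26.47%, Network-based: 53.85%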
Hence, some Levers were not included because of their very high diffusion rate (paper documents and interpersonal relationships were in use in almost all the SMEs) or their very low one (PDM and internal meetings were very scarcely adopted).

INTERPRETATION. The 45 companies in the first cluster (KM-TRADITIONAL) follow what we called a 'Traditional' approach. On the intra-firm side, this approach is characterized by a very low diffusion of ICT tools: the most common computer-based tools for KM are, as a matter of fact, internal networks (Intra-Nets) used to support internal communication. Interactions and information sharing with external actors rely mainly on gatekeepers and on interaction with customers and suppliers, while the use of ICTs such as 3D CAD and Inter-Net is not widespread. Hence, the 'Traditional approach' is characterized, both internally and externally, by the use of traditional mechanisms for transferring and consolidating knowledge and by the relegation of ICT to a marginal role. Cluster KM-CODIFICATION (34 SMEs) is characterized by a high adoption rate of ICT tools supporting knowledge diffusion and storage at both the intra-company and the inter-company level. At the internal level and from an organizational point of view, firms belonging to this cluster adopt DBs for design solutions; from a technological point of view, knowledge is managed and shared inside the company mainly by means of 2D CAD, CAM, CAE and Intra-Net, all of which reach their highest adoption percentages in this cluster. At the inter-firm level, the interaction with external actors along the supply chain is supported (on the ICT side) by Inter-Net, while the interaction with customers and suppliers, as well as the use of gatekeepers, seems to be less important, especially in comparison with the other two clusters. We can argue that this cluster is characterized by the strongest effort in managing and transferring knowledge in codified forms: ICT plays a key role in codifying knowledge and making it people-independent. Finally, the 39 SMEs belonging to the third cluster (KM-NETWORK-BASED) adopt what we named a 'Network-based approach'. At the intra-company level, emphasis is mainly on traditional organizational tools such as paper documents and interpersonal relationships, even if a high diffusion of 2D CAD should be noted. At the external level, knowledge sharing is strongly supported by the interaction with customers and suppliers and by the use of gatekeepers; it is interesting to note that this cluster presents the highest adoption percentages of Inter-Net and 3D CAD, which support collaboration with external actors along the supply chain. Hence, it is possible to conclude that the KM Configurations identified in the first research step are three in number, that all three exist in sufficiently large numbers, and that together they cover a great percentage of all possible KM Configurations.

RQ2. The drivers explaining KM Configurations
METHODOLOGY. In order to understand if and how different Contingencies influence the choice of KM Configurations, a two-stage analysis was performed. In the first stage, a frequency analysis was aimed at creating homogeneous groups of SMEs for each Contingency; its results are reported in Table 7. In the second stage, a univariate nonlinear regression model was used: the Probit model was chosen because the dependent variable (KM Configuration) is binary.
Table 7 Contingencies

Contingencies*   Groups      Description                                    Number of SMEs   Missing
LD               LD1         One manufacturing site                         82               1
                 LDmulti     Two or more manufacturing sites                44
IAC              IACcompl    no. of items > 10                              78               1
                 IACsimpl    1 ≤ no. of items ≤ 10                          48
TC               TCcompl     0.2 < HH index < 0.8                           73               3
                 TCsimpl     0.8 ≤ HH index < 1                             51
DC               DCc         Catalogue and catalogue with modifications     84               0
                 DCo         Production on order                            43
PSC              PSCc        Producers of components and sub-components     57               1
                 PSCfp       Producers of finished products                 69

*Legend: LD: Level of Dispersion; IAC: Internal Architectural Complexity; TC: Technological Complexity; DC: Degree of Customization; PSC: Position in the Supply Chain.
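The HH index used for the technological-complexity contingency can be computed directly from a product's cost breakdown. A minimal sketch in Python follows; the cost shares are hypothetical, and the classification threshold follows the grouping in Table 7.

    def hh_index(cost_shares):
        # Herfindahl-Hirschman Index: sum of squared cost fractions attributed
        # to the embedded technologies (1 = single technology; lower values
        # indicate greater technological diversity, i.e. higher complexity).
        total = sum(cost_shares.values())
        return sum((c / total) ** 2 for c in cost_shares.values())

    # Hypothetical cost breakdown of a product by embedded technology.
    shares = {"mechanical": 0.50, "electromechanical": 0.20,
              "electronic": 0.15, "hydraulic": 0.10, "software": 0.05}

    hh = hh_index(shares)   # 0.25 + 0.04 + 0.0225 + 0.01 + 0.0025 = 0.325
    group = "TCcompl" if hh < 0.8 else "TCsimpl"
    print(f"HH index = {hh:.3f} -> {group}")   # HH index = 0.325 -> TCcompl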
Table 8 Probit models connecting the Contingencies with, respectively, KM-TRADITIONAL, KM-CODIFICATION and KM-NETWORK-BASED (dependent variables)

Variable                                    Coefficient          Coefficient          Coefficient
                                            KM-TRADITIONAL       KM-CODIFICATION      KM-NETWORK-BASED
Level of Dispersion (LD)                     0.11 (0.24)         -0.11 (0.25)         -0.30 (0.25)
Internal Architectural Complexity (IAC)      0.08 (0.28)          0.09 (0.29)         -0.61 (0.30)**
Technological Complexity (TC)                0.20 (0.28)         -0.38 (0.20)*         0.01 (0.30)
Degree of Customization (DC)                 0.50 (0.25)**       -0.39 (0.26)          0.10 (0.25)
Position in the Supply Chain (PSC)          -0.40 (0.18)**       -0.38 (0.19)**       -0.18 (0.19)

The first number is the value of the coefficient; the number in brackets is the standard error. *Prob. < 10%; **Prob. < 5%; ***Prob. < 1%.
RESULTS. The results of the Probit analyses are presented in Table 8, which considers the Contingencies as independent variables and, respectively, the Traditional, the Codification and the Network-based approaches as the dependent variables.

INTERPRETATION. All but one of the Contingencies identified as relevant in the comparative case analysis in the first step of the research were confirmed to be relevant in explaining the choice among the alternative KM Configurations in the survey (Table 9).
Table 9 Relations between contingencies and KM configurations

                                      KM Configurations
Contingencies                         Traditional      Codification     Network-based
Level of Dispersion
Internal Architectural Complexity                                       Complex
Technological Complexity                               Complex
Degree of Customization               Make-to-order
Position in the Supply Chain          Components       Components

Only the significant relations between the KM Configurations and Contingencies are reported.
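The estimates in Table 8 come from three separate Probit fits, one per Configuration. A sketch of how they could be reproduced in Python with pandas and statsmodels; the file name and the column coding are hypothetical:

    import pandas as pd
    import statsmodels.api as sm

    # Hypothetical coding: one row per SME, binary contingency dummies
    # (per Table 7) and one 0/1 membership dummy per KM Configuration.
    df = pd.read_csv("sme_survey.csv")
    X = sm.add_constant(df[["LD", "IAC", "TC", "DC", "PSC"]])

    for config in ["TRADITIONAL", "CODIFICATION", "NETWORK_BASED"]:
        probit = sm.Probit(df[config], X).fit(disp=False)
        print(config)
        print(probit.params.round(2))   # coefficients, cf. Table 8
        print(probit.bse.round(2))      # standard errors (brackets in Table 8)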
The 'Traditional approach' is followed by those SMEs which produce components on order, and which therefore have to manage the complexity in PI arising from the need to continuously exchange knowledge with their customers by means of interpersonal relationships. Evidently, the available supply of ICT tools is not considered adequate with respect to the complexity of the communication to be managed: in other words, the benefits of existing ICT may be considered poor compared with its costs, making ICT investments unprofitable. This is consistent both with the importance that gatekeepers and relationships with customers and suppliers assume in this cluster and with its scarce use of CAD tools. The 'Codification approach' is typical of firms manufacturing technologically complex components: for these firms, the need to internally integrate heterogeneous knowledge bases requires the use of CAD, CAM and CAE tools to codify knowledge. At the same time, the use of Intra-Nets and Inter-Nets allows, respectively, an easier integration of the knowledge owned by the different designers/departments and a better management and transfer of knowledge to and from customers. SMEs belonging to the 'Network-based approach' cope with the complexity arising from the need to integrate different parts into the product architecture. Knowledge sharing with external partners (particularly suppliers) assumes a key role: the presence of gatekeepers and the interaction with customers and suppliers are strongly supported in this cluster by the use of Internet-based technologies and 3D CAD, which offer important advantages in terms of: i) a clear and immediate understanding of the way the product is evolving, facilitating the anticipation of possible inconsistencies and manufacturing/assembling problems, and ii) enhanced support for the simultaneous work of designers and for the interaction of different departments/organizations.

RQ3. Impact of the different KM Configurations on Performances
METHODOLOGY. We applied a two-stage analysis: 1) factor analysis, in order to reduce complexity and group the performance measures into a limited number of groups, each of which can be represented by a single surrogate representative [106], and 2) association analysis of the KM Configurations with the surrogate representative of each factor.
RESULTS. Factor analysis of the performance measures produces a factor structure with items loading on the appropriate factors; factor loadings greater than +0.50 are considered very significant [106] (Table 10).
Table 10 Factor analysis of PI performances

Performance dimensions                                       Factor 1:               Factor 2:   Factor 3:
                                                             Information             Timing      Network
                                                             Management Efficiency               Integration
Improvements in data storage efficiency                      0.85755                 0.03174     0.06342
Higher internal communication                                0.72191                 0.01036     0.17757
Better re-use of data and information                        0.64367                 0.19369     0.02220
Better standardization of PI procedures                      0.61821                 0.13695     0.17841
Lower cost for information retrieval                         0.61351                 0.19580     0.09820
Reduction in PI faultiness                                   0.52698                 0.27168     0.33956
Idle time reduction                                          0.11234                 0.96288     0.10406
Time-to-market reduction                                     0.11234                 0.96288     0.10406
Working process and assembling cycles lead time reduction    0.23691                 0.62583     0.26361
Better understanding of customer needs                       0.08426                 0.08691     0.82712
Better collaboration with suppliers                          0.11465                 0.20575     0.75617
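A factor structure like the one in Table 10 can be extracted with a rotated factor analysis. A minimal sketch in Python with pandas and scikit-learn (the rotation option assumes scikit-learn 0.24 or later; the input file and its eleven columns are hypothetical stand-ins for the performance ratings listed in Table 10):

    import pandas as pd
    from sklearn.decomposition import FactorAnalysis

    # Hypothetical 127 x 11 matrix of the performance ratings in Table 10.
    perf = pd.read_csv("pi_performance.csv")

    fa = FactorAnalysis(n_components=3, rotation="varimax").fit(perf)
    loadings = pd.DataFrame(
        fa.components_.T, index=perf.columns,
        columns=["InfoMgmtEfficiency", "Timing", "NetworkIntegration"])

    # Items are assigned to the factor on which they load above +0.50, cf. Table 10.
    print(loadings.round(2))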
Table 11 Impact of ICT on performance as perceived in alternative KM configurations

Factors                              Surrogate measure                          Traditional   Codification   Network-based
Information management efficiency    Improvements in data storage efficiency                  +              ++
Timing                               Time-to-market reduction                                 +              +
Network integration                  Better understanding of customer needs                                  +
More exactly, three factors emerge: 1) Information Management Efficiency, 2) Timing and 3) Network Integration. The first group deals with data re-use and storage, communication, the cost of information retrieval, PI standardization and faultiness. The second group (Timing) entails time-to-market, working-cycle lead times and the compression of idle time in the interaction with customers and suppliers. The last group (Network Integration) deals with aspects such as the understanding of customer needs and the collaboration with suppliers. Because of their inherent significance and high factor loadings, we selected "Improvements in data storage efficiency," "Time-to-market reduction," and "Better understanding of customer needs" as surrogates for factors 1, 2 and 3, respectively. Table 11 shows the main findings emerging from the association between these surrogate measures and the KM Configurations (see notes 5 and 6).

Note 5: The results of the association analysis connecting the KM clusters with each surrogate are available on request.
Note 6: We checked for differences in economic and PI performances. As regards the former, no statistically significant gap exists between the clusters in terms of employees, turnover, assets and ROE (data available on request). Such differences should be evaluated over the long period.
INTERPRETATION. Firms adopting a Network-based Configuration perceive the greatest benefits from ICT in all the investigated performance
dimensions. On the contrary, companies following a Traditional KM Configuration do not appear to perceive relevant contributions from the ICT tools used. Lastly, SMEs following a Codification approach perceive lower benefits from ICT than those following the Network-based approach in all the analyzed dimensions. Although assumptions about the cause-effect relationship between the use of ICT and PI performance are based on company perceptions rather than on hard data, the empirical results reinforce the hypothesis that web-based applications in SMEs are more effective when used across company boundaries to connect with external sources of knowledge and in strong connection with inter-company organizational integration mechanisms.

7. IMPLICATIONS FOR MANAGERIAL ACTION AND FUTURE RESEARCH
Results from the empirical analysis enhance the understanding of SME behavior regarding the adoption of ICT and organizational tools for Knowledge Management in the area of Product Innovation: the mix of tools (the Levers) is not incidental but driven by specific Contingencies. In particular, the focus is on those Levers (ICT and organizational mechanisms) which facilitate the management of the variable with the highest degree of complexity. Companies in the selected sample of SMEs cluster into three main Configurations of choices. SMEs producing components on order tend to use a Traditional approach, making lesser use of new ICTs and relying mainly on organizational tools both at the internal and at the external level. At the same time, they appear to be the least satisfied with the results achieved in terms of the analyzed performances. Companies producing technologically complex components tend to adopt a Codification approach, using solution databases in connection with 2D CAD, CAM and CAE applications aimed at codifying knowledge. Their use of Internet-based communication technologies is focused both internally and externally. Although they invest relevant resources in ICT, these companies do not perceive very high benefits in terms of efficiency and integration. Finally, companies producing architecturally complex products make more use of 3D CAD and Internet-based technologies and perceive more benefits from ICT. For these companies, the role of ICT is communication and inter-firm integration rather than the management of internal knowledge, which in fact does not represent a major issue for them. Therefore, companies integrating different components and sub-systems into the final product architecture seem to benefit the most from Internet-based technologies. The latter perceive the same benefits in terms of Timing as the Codification approach and the highest benefits not only in terms of better integration with customers and suppliers, but also in terms of Information Management Efficiency. It is worth observing that, differently from what is commonly assumed for larger companies, the need for support in internal communication and in complexity management at the ICT level does not appear to be the main driver for SMEs in the adoption of Internet-based tools in Product Innovation.
According to the results, our analysis highlights the strategic role Internet-based technology can play in supporting PI in SMEs which integrate different parts into the final product architecture and which manage technologically complex components. Future research will explore this issue further, complementing the use of the survey with more intensive and qualitative research methodologies, such as longitudinal case studies and action research. The latter are fundamental for exploring causal relationships between KM Configurations and performance and for analyzing the process of implementation of KM systems. Furthermore, management research should make a positive contribution to the development and implementation of KM tools and models more adequate to the needs of SMEs.

REFERENCES
[1] Clark, K. B. and Fujimoto, T. Product Development Performance: Strategy, Organization, and Management in the World Auto Industry. Boston: HBS Press, 1991.
[2] Cusumano, M. A. and Nobeoka, K. Strategy, Structure and Performance in Product Development: Observations from the Auto Industry. Research Policy 21 (3): 265-293 (1992).
[3] Dosi, G. Technological Paradigms and Technological Trajectories: a Suggested Interpretation of the Determinants and Directions of Technical Change. Research Policy 11: 147-162 (1982).
[4] D'Aveni, R. A. Hypercompetition. New York: Free Press, 1994.
[5] Wheelwright, S. C. and Sasser, W. E. The New Product Development Map. Harvard Business Review: 112-125 (May-June 1989).
[6] Wheelwright, S. C. and Clark, K. B. Creating Project Plans to Focus Product Development. Harvard Business Review: 71-82 (March-April 1992).
[7] Corso, M., Muffatto, M., and Verganti, R. Reusability and Multi-Project Policies: a Comparison of Approaches in the Automotive, Motorcycle and Earthmoving Machinery Industries. Robotics and Computer Integrated Manufacturing Journal 15 (1): 155-165 (1999).
[8] Sanderson, S. and Uzumeri, M. Managing Product Families: the Case of the Sony Walkman. Research Policy 24: 761-782 (1995).
[9] Meyer, M. H. and Utterback, J. M. The Product Family and the Dynamics of Core Capability. Sloan Management Review Spring: 29-47 (1993).
[10] Nonaka, I. Redundant and Overlapping Organisation: a Japanese Approach to Managing the Innovation Process. California Management Review 69 (6): 96-104 (1990).
[11] Imai, K., Nonaka, I., and Takeuchi, H. Managing the New Product Development Process: How Japanese Companies Learn and Unlearn. In: The Uneasy Alliance. Managing the Productivity-Technology Dilemma, K. B. Clark, R. H. Hayes, and C. Lorenz (eds.). Boston: Harvard Business School Press, 1985.
[12] Bartezzaghi, E., Corso, M., and Verganti, R. Continuous Improvement and Inter-Project Learning in New Product Development. International Journal of Technology Management 14 (1): 116-138 (1997).
[13] Corso, M. From Product Development to Continuous Product Innovation: Mapping the Routes of Corporate Knowledge. International Journal of Technology Management, forthcoming.
[14] Brown, S. L. and Eisenhardt, K. M. The Art of Continuous Change: Linking Complexity Theory and Time-Paced Evolution in Relentlessly Shifting Organizations. Administrative Science Quarterly 42: 1-34 (1997).
[15] Corso, M., Martini, A., and Pellegrini, L. Knowledge Management Configurations in Network-Based Product Innovation: a Case Study Based Approach. CINet Conference, Continuous Innovation in Business Processes and Networks, 15-18 September 2002, Dipoli Congress Centre, Espoo, Finland: 185-196 (2002).
[16] Hamel, G. and Prahalad, C. K. Competing for the Future. Harvard Business Review July-August: 122-128 (1994).
[17] Zack, M. H. Developing a Knowledge Strategy. California Management Review 41 (3): 125-140 (Spring 1999).
[18] Penrose, E. T. Limits to the Growth and Size of Firms. American Economic Review 45: 531-543 (1955).
[19] Wernerfelt, B. A Resource-Based View of the Firm: Ten Years After. Strategic Management Journal 5: 171-180 (1984).
[20] Rumelt, R. P. Toward a Strategic Theory of the Firm. In: Competitive Strategic Management, R. Lamb (ed.). Englewood Cliffs, NJ: Prentice Hall, 1984, pp. 556-570.
[21] Barney, J. Firm Resources and Sustained Competitive Advantage. Journal of Management 17 (1): 99-120 (1991).
[22] Peteraf, M. A. The Cornerstones of Competitive Advantage: a Resource-Based View. Strategic Management Journal 14: 179-191 (1993).
[23] Collis, D. J. and Montgomery, C. A. Competing on Resources: Strategy in the 1990s. Harvard Business Review 73 (4): 118-128 (Jul/Aug 1995).
[24] Nonaka, I., Byosiere, P., Borucki, C. C., and Konno, N. Organizational Knowledge Creation Theory: a First Comprehensive Test. International Business Review 3 (4): 337-351 (Dec 1994).
[25] Quinn, J. B., Anderson, P., and Finkelstein, S. Managing Professional Intellect: Making the Most of the Best. Harvard Business Review 74 (2): 71-80 (Mar/Apr 1996).
[26] Arora, A. and Gambardella, A. The Changing Technology of Technological Change: General and Abstract Knowledge and the Division of Innovative Labour. Research Policy 23: 523-532 (1994).
[27] Fiol, C. M. and Lyles, M. A. Organizational Learning. Academy of Management Review 10 (4): 803-813 (October 1985).
[28] Argyris, C. and Schon, D. A. Organizational Learning: a Theory of Action Perspective. Reading, Mass.: Addison-Wesley, 1978.
[29] Senge, P. M. The Fifth Discipline: The Art and Practice of the Learning Organization. New York: Doubleday, 1990.
[30] Henderson, R. M. and Clark, K. B. Architectural Innovation: the Reconfiguration of Existing Product Technologies and the Failure of Established Firms. Administrative Science Quarterly 35: 9-30 (1990).
[31] Henderson, R. M. and Cockburn, I. Measuring Competence? Exploring Firm Effects in Pharmaceutical Research. Strategic Management Journal 15: 63-84 (Winter 1994).
[32] Wiig, K. M. Integrating Intellectual Capital and Knowledge Management. Long Range Planning 30 (3): 399-405 (June 1997).
[33] Petrash, G. D. Journey to a Knowledge Value Management Culture. European Management Journal 14 (4): 365-373 (1996).
[34] Nonaka, I. The Knowledge-Creating Company. Harvard Business Review: 96-104 (November-December 1991).
[35] Davenport, T. H. and Prusak, L. Working Knowledge: How Organizations Manage What They Know. Harvard Business School Press, 1998.
[36] Ricardo, D. Principles of Political Economy and Taxation. London: J. Murray, 1817.
[37] Conner, K. R. and Prahalad, C. K. A Resource-Based Theory of the Firm: Knowledge Versus Opportunism. Organization Science 7 (5): 477-501 (September-October 1996).
[38] Kogut, B. and Zander, U. Knowledge of the Firm, Combinative Capabilities, and the Replication of Technology. Organization Science 3: 383-397 (1992).
[39] Amit, R. and Schoemaker, P. J. H. Strategic Assets and Organizational Rents. Strategic Management Journal 14: 33-46 (1993).
[40] Pan, S. L. and Scarbrough, H. A Socio-Technical View of Knowledge-Sharing at Buckman Laboratories. Journal of Knowledge Management 2 (1): 55-66 (1998).
[41] Clark, K. B. and Wheelwright, S. C. Managing New Product and Process Development. New York: The Free Press, 1993.
[42] Beamish, N. G. and Armistead, C. G. Selected Debate from the Arena of Knowledge Management: New Endorsements for Established Organizational Practices. International Journal of Management Reviews 3 (2): 101-111 (2001).
[43] Hedlund, G. A Model of Knowledge Management and the N-Form Corporation. Strategic Management Journal 15: 73-90 (1994).
[44] Nonaka, I. and Takeuchi, H. The Knowledge-Creating Company. New York: Oxford University Press, 1995, pp. 93-138.
[45] Staples, D. S., Greenaway, K., and McKeen, J. D. Opportunities for Research About Managing the Knowledge-Based Enterprise. International Journal of Management Reviews 3 (1): 1-20 (2001).
[46] Martensson, M. A Critical Review of Knowledge Management As a Management Tool. Journal of Knowledge Management 4: 204-216 (2000).
[47] Alavi, M. and Leidner, D. Knowledge Management and Knowledge Management Systems: Conceptual Foundations and Research Issues. MIS Quarterly 25: 107-136 (2001).
[48] Shin, M., Holden, T., and Schmidt, R. From Knowledge Theory to Management Practice: Towards an Integrated Approach. Information Processing and Management 37: 335-355 (2001).
[49] Sullivan, L. P. Quality Progress (1986).
[50] Ancona, D. G. and Caldwell, D. F. Beyond Boundary Spanning: Managing External Dependence in Product Development Teams. Journal of High Technology Management Research 1: 119-135 (1990).
[51] Keller, R. T. Predictors of the Performance of Project Groups in R&D Organizations. Academy of Management Journal 29: 715-726 (1986).
[52] Dougherty, D. Understanding New Markets for New Products. Strategic Management Journal 11: 59-78 (1990).
[53] Dougherty, D. Interpretative Barriers to Successful Product Innovation in Large Firms. Organization Science 3: 179-202 (1992).
[54] Joyce, W. F. Matrix Organizations: a Social Experiment. Academy of Management Journal 3: 536-561 (1986).
[55] McCord, K. R. and Eppinger, S. Managing the Integration Problem in Concurrent Engineering. ICRMOT Working Paper WP#95-93 (1993).
[56] Iansiti, M. Shooting the Rapids: Managing Product Development in Turbulent Environments. California Management Review 38 (1): 37-58 (1995).
[57] MacCormack, A. and Iansiti, M. Product Development Flexibility. 4th International Product Development Management Conference, EIASM, Stockholm, Sweden (1997).
[58] Verganti, R., MacCormack, A., and Iansiti, M. Rapid Learning and Adaptation in Product Development: an Empirical Study on the Internet Software Industry. EIASM 5th International Product Development Management Conference, Como, Italy, May 25th-26th (1998).
[59] Speranza, M. G. and Vercellis, C. Hierarchical Models for Multi-Project Planning and Scheduling. European Journal of Operational Research 78 (1994).
[60] Czajkowski, A. F. and Jones, S. Selecting Interrelated Projects in Space Technology Planning. IEEE Transactions on Engineering Management 33: 17-24 (1986).
[61] De Maio, A., Verganti, R., and Corso, M. A Multi-Project Management Framework for New Product Development. European Journal of Operational Research 78: 178-191 (1994).
[62] Itami, H. Mobilizing Invisible Assets. Cambridge: Harvard University Press, 1987.
[63] Witter, J., Clausing, D., Laufenberg, L., and Soares de Andrade, R. Reusability: the Key to Corporate Agility. 2nd International Product Development Management Conference on New Approaches to Development and Engineering, EIASM, Goteborg, 30-31 May (1994).
[64] Hayes, R. H., Wheelwright, S. C., and Clark, K. B. Dynamic Manufacturing: Creating the Learning Organization. New York: The Free Press, 1988.
[65] Scheinberg, M. V. Planning of Portfolio of Projects. Project Management Journal 23 (2) (June 1992).
[66] Ingelgard, A., Roth, J., Rami Shani, A. B., and Styhre, A. Dynamic Learning Capability As Enabler for Knowledge Creation: Clinical R&D in a Pharmaceutical Company. 8th International Product Development Management Conference, EIASM, Twente University, Faculty of Technology & Management, Enschede, The Netherlands, June 11-12: 499-513 (2001).
[67] Gieskes, J. F. B. and van der Heijden, B. I. J. M. Stimulating Learning Behaviour in Product Innovation Processes. 8th International Product Development Management Conference, EIASM, Twente University, Faculty of Technology & Management, Enschede, The Netherlands, June 11-12: 317-332 (2001).
[68] von Hippel, E. and Tyre, M. J. How Learning by Doing Is Done: Problem Identification in Novel Process Equipment. Research Policy 24: 1-12 (1995).
[69] Smeds, R., Olivari, P., and Corso, M. Continuous Learning in Global Product Development: a Cross-Cultural Comparison. International Journal of Technology Management, forthcoming (2000).
[70] Nonaka, I. and Konno, N. The Concept of 'Ba': Building a Foundation for Knowledge Creation. California Management Review 40 (3): 40-54 (1998).
[71] Brown, J. S. and Duguid, P. Organizational Learning and Communities of Practice: Toward a Unified View of Working, Learning and Innovating. Organization Science 2 (1): 40-57 (1991).
[72] Brown, J. S. and Duguid, P. Organizing Knowledge. California Management Review 40 (3): 90-111 (1998).
[73] Lave, J. and Wenger, E. Situated Learning: Legitimate Peripheral Participation. Cambridge: Cambridge University Press, 1991.
[74] Wenger, E. C. and Snyder, W. M. Communities of Practice: the Organizational Frontier. Harvard Business Review 78 (1): 139-145 (2000).
[75] Bessant, J., Caffyn, S., Gilbert, J., Harding, R., and Webb, S. Rediscovering Continuous Improvement. Technovation 14 (1): 17-29 (1994).
[76] Caffyn, S. Extending Continuous Improvement to the New Product Development Process. R&D Management 27 (3): 253-267 (1997).
[77] Bartezzaghi, E., Corso, M., and Verganti, R. Managing Knowledge in Continuous Product Innovation. 6th EIASM Product Development Conference (1999).
[78] Allen, T. J. Managing the Flow of Technology. Cambridge: MIT Press, 1997.
[79] Katz, R. and Tushman, M. L. An Investigation Into the Managerial Roles and Career Paths of Gatekeepers and Project Supervisors in a Major R&D Facility. R&D Management 11: 103-110 (1981).
[80] Katz, R. Career Issues in Human Resource Management. Englewood Cliffs, NJ: Prentice Hall, 1982.
[81] Reid, D., Bussiere, D., and Greenaway, K. Alliance Formation Issues for Knowledge-Based Enterprises. International Journal of Management Reviews 3 (1): 79-100 (2001).
[82] Dyer, J. H. Specialized Supplier Networks as a Source of Competitive Advantage: Evidence from the Auto Industry. Strategic Management Journal 17 (4): 271-291 (1996).
[83] Robertson, M., Swan, J., and Newell, S. The Role of Networks in the Diffusion of Technological Innovation. Journal of Management Studies 33 (3): 333-359 (1996).
[84] Bonaccorsi, A. and Lipparini, A. Strategic Partnerships in New Product Development: an Italian Case Study. Journal of Product Innovation Management 11: 134-145 (1994).
[85] Powell, W. W., Koput, K. W., and Smith-Doerr, L. Interorganizational Collaboration and the Locus of Innovation: Networks of Learning in Biotechnology. Administrative Science Quarterly 41: 116-145 (1996).
[86] Pisano, G. Knowledge, Integration and the Locus of Learning: an Empirical Analysis of Process Development. Strategic Management Journal 15: 85-100 (1994).
[87] Pisano, G. The Development Factory: Unlocking the Potential of Process Innovation. Lessons from Pharmaceuticals and Biotechnology. Cambridge: Harvard Business School Press, 1996.
[88] Frear, C. R. and Metcalf, L. E. Strategic Alliances and Technology Networks: a Study of a Cast-Products Supplier in the Aircraft Industry. Industrial Marketing Management 24 (5): 379-390 (1995).
[89] von Hippel, E. The Dominant Role of Users in the Scientific Instrument Innovation Process. Research Policy 5: 212-239 (1976).
[90] von Hippel, E. Has a Customer Already Developed Your Next Product? Sloan Management Review 18 (2): 63-74 (1977).
[91] von Hippel, E. Successful Industrial Products from Customer Ideas: Presentation of a New Customer-Active Paradigm with Evidence and Implications. Journal of Marketing 42 (1): 39-49 (January 1978).
[92] Gupta, A. K. and Wilemon, D. Accelerating the Development of Technology-Based New Products. California Management Review 2: 24-44 (1990).
[93] von Hippel, E. The Sources of Innovation. Oxford: Oxford University Press, 1988.
[94] Cusumano, M. A. and Takeishi, A. Supplier Relations and Management: a Survey of Japanese, Japanese-Transplant, and U.S. Auto Plants. Strategic Management Journal 12 (8): 563-588 (1991).
[95] Edwards, C. T. and Samimi, R. Japanese Interfirm Networks: Exploring the Seminal Sources of Their Success. Journal of Management Studies 34 (4): 489-510 (1997).
[96] Quinn, J. B. Managing Innovation: Controlled Chaos. Harvard Business Review 85 (3): 73-84 (1985).
[97] Takeuchi, H. and Nonaka, I. The New Product Development Game. Harvard Business Review: 137-146 (January-February 1986).
[98] Leonard-Barton, D. The Factory As a Learning Laboratory. Sloan Management Review Fall: 23-38 (1992).
[99] Teece, D. J. Profiting from Technological Innovations: Implications for Integration. Research Policy 15 (6): 285-305 (1986).
[100] Lundvall, B. A. Innovation as an Interactive Process: from User-Producer Interaction to the National System of Innovation. In: Technical Change and Economic Theory, G. Dosi (ed.). London: Pinter, 1988.
[101] Lee, Y. S. Technology Transfer and the Research University: a Search for the Boundaries of University-Industry Collaboration. Research Policy 25 (6): 843-863 (1996).
[102] Hansen, M. T., Nohria, N., and Tierney, T. What's Your Strategy for Managing Knowledge? Harvard Business Review: 106-116 (1999).
[103] Corso, M., Martini, A., Paolucci, E., and Pellegrini, L. Information and Communication Technologies in Product Innovation within SMEs: the Role of Product Complexity. Enterprise & Innovation Management 2 (1): 35-48 (2001).
[104] Corso, M., Martini, A., Paolucci, E., and Pellegrini, L. Knowledge Management Configurations in Italian Small-and-Medium Enterprises. Integrated Manufacturing Systems 14 (5) (2003).
[105] Kogut, B. and Zander, U. Knowledge and the Speed of the Transfer and Imitation of Organizational Capabilities: an Empirical Test. Organization Science 6 (1): 76-92 (January-February 1995).
[106] Hair, J. F. Jr., Anderson, R. E., and Tatham, R. L. Multivariate Data Analysis with Readings. New York: Macmillan; London: Collier Macmillan, 1987.
KNOWLEDGE-BASED MEASUREMENT OF ENTERPRISE AGILITY
NIKOS C. TSOURVELOUDIS
1. INTRODUCTION
One essential requirement for business survival is the continuous ability to meet customer needs and demands. Market needs cause unceasing changes in product life cycle, shape, quality, and price. Agility is an enterprise-wide response to an increasingly competitive and changing business environment, based on four cardinal principles: enrich the customer; master change and uncertainty; leverage human resources; and cooperate to compete [1], [2]. Agility is more formally defined as the ability of an enterprise to operate profitably in a rapidly changing and continuously fragmenting global market environment by producing high-quality, high-performance, customer-configured goods and services. It is the outcome of technological achievement and of advanced organizational and managerial structure and practice, but also a product of human abilities, skills, and motivations [2]. The application of agile manufacturing methods started in the late 1980s as a response to competition from Japan and the other Pacific Rim countries. Some of these methods include just-in-time manufacturing, flexible manufacturing systems, and computer and communication networks. Several programs and initiatives were started to help U.S. companies change their organization and production processes [2]. Such programs include the Department of Energy's (DoE) Demand Activated Manufacturing Architecture [11] (textile/apparel industries), Technologies Enabling Agile Manufacturing (TEAM) [12], etc. In addition, several Agile Manufacturing Research Institutes (AMRIs) have already been established, like the Aerospace Agile Manufacturing Research Center, the
Machine Tool Agile Manufacturing Research Institute (MT-AMRI), and the Rensselaer Electronics Agile Manufacturing Research Institute (EAMRI). These institutes and their activities are described in [10]. Agility, like many other general concepts, is ill defined and thus has a different meaning for different people, even within the same organization. Very often agility is confused with flexibility. In manufacturing terms, flexibility refers to the range of products achievable with certain (production) strategies, while agility refers to quick movement (change) of the whole enterprise in a certain direction. Flexibility normally refers to the capability of a factory floor to rapidly change from one task or from one production route to another, including the ability to change from one situation to another, with each situation not always defined ahead of time. Agility refers to the strategic ability of an enterprise to adapt to and accommodate quickly unplanned and sudden changes in market opportunities and pressures; in this sense it is wider than flexibility. The problems in measuring both flexibility and agility are more or less the same. As in the case of measuring manufacturing flexibility [17], there is no direct, adaptive, and holistic treatment of agility components. In [3], the overall problem of agility measurement is reduced to three simple, yet fundamental questions: what to measure, how to measure it, and how to evaluate the results. Furthermore, there is no "synthesis method" to combine measurements and determine agility. Indeed, a literature review reveals overlaps in the dimensions of agility as well as the lack of a universal metric [4]. There does not appear to be a measure that identifies certain parameters/indicators of the agility level, despite some efforts in that direction. Some guidelines towards agility measurement, together with the difficulties of such a task, are given in [2], along with a comprehensive questionnaire for the monitoring of various agility factors. These questions are useful because they can be part of the knowledge acquisition procedure of any knowledge-based agility measure. However, it should be emphasized that the agile manufacturing literature is rife with generalities, especially when it comes to agility metrics. An agility measurement methodology based on acquired knowledge is described in this chapter. Knowledge is represented via linguistic IF-THEN rules, which have a number of clear advantages over other representation techniques. The first and foremost advantage is their simplicity. The know-how for measuring agility can, in most cases, be easily modeled by IF-THEN rules. Further, it is easy to make logical inferences in which various forms of uncertainty and fuzziness are present. This chapter is based on the research reported by Tsourveloudis and Valavanis in [15]. The proposed framework aims at providing the fundamentals of an adaptive knowledge-based methodology for the measurement of agility. The definition and derivation of a combined agility measure is based on a well-defined group of individually defined (and then grouped) quantitative metrics. By utilizing these metrics, decision-makers have the opportunity to examine and compare different systems at different agility levels.
The rest of the chapter is organized as follows. In Section 2, some general steps for achieving and managing agility are provided. Guidelines for the construction of any agility measure, along with the characteristics and the mathematical formulation of the proposed methodology, are presented in Section 3. In Section 4, we define four distinct agility infrastructures used for the measurement. Specific measuring variables are defined and explained. Section 5 gives a brief arithmetic example of the methodology. The chapter concludes with a discussion and remarks section.

2. MANAGING AN ADAPTIVE INFRASTRUCTURE
Global market needs cause unceasing changes in the life cycle, shape, quality, and price of products. Manufacturing competitiveness has moved from the "era of mass production" to the "era of agility". It is a common belief today that the business environment is changing faster than firms' ability to enable change. Yesterday's production infrastructure was built for continuous production, stability, and manageability. Even the reengineering initiatives of a decade ago were more about redesigning new processes than about making those processes easy to change over time. The agility era requires a production infrastructure that has the capacity to adapt and deliver measurable improvements in manufacturing processes. An adaptive production infrastructure responds rapidly to new business conditions and opportunities, takes advantage of new technologies, accommodates unanticipated changes, and demonstrates the value of agility through a measurements-driven approach. An adaptive production/manufacturing infrastructure can expand or shrink in alignment with business needs. It is useful to see a manufacturing system from a design viewpoint. All manufacturing infrastructures can be broken down into conceptual components, the integration of which makes up the manufacturing system. These components are: Materials, Processes, Equipment/Tools, Facilities, Support/Logistics, and People. In many cases, the "system" fails because the above-mentioned components are viewed separately or because the dynamic nature of the information flowing across the production infrastructure is not understood. A three-step approach for minimizing the "agility gap" in manufacturing systems management may be the following:

Step 1: Design and plan agility improvements. It is essential to identify business challenges and processes for which agility is a basic factor. Key considerations include the company's business strategy, relevant industry and technological trends, competitive pressures, and the overall economic environment. Important questions to answer: What does it mean for a particular manufacturing system to be agile? How agile is the system now? What will it take to achieve the desired results? What is the cost of these changes?

Step 2: Build an adaptive infrastructure according to the four fundamental agility design principles: 1) enrich the customer; 2) master change and uncertainty; 3) leverage human resources; and 4) cooperate to compete. The infrastructure must be built to utilize agility metrics and diagnostics. Adaptive infrastructure solutions need to deliver against some combination of the three key agility metrics: time, range, and ease. General
conditions for achieving agile manufacturing are the following [17]:

• a high degree of integration in the company, based not only on information technology but also on human mutual interconnection,
• establishment of work teams based on natural and logical associations,
• raising the responsibility level of all employees,
• continuous learning, training, testing, and introduction of novelties,
• introduction of the virtual company concept,
• highly trained and versatile experts organized in teams, and
• introduction of knowledge, change, and risk management.

These requirements must be adapted to the specific needs of a company with respect to the type of production.

Step 3: Measure agility results. Regardless of the structure of the agility measure, it is important that any practical agility metric should [18], [14]:

1. Focus on specific divisions of agility from which overall agility measures will be derived. The observable parameters for each measure should be specified together with the derivation methodology.
2. Allow agility comparisons among different installations.
3. Provide a situation-specific measurement by taking into account the particular characteristics of the system/enterprise.
4. Incorporate relevant accumulated human knowledge/expertise.

3. AGILITY MODELING AND MEASUREMENT FUNDAMENTALS
Measuring agility is not a trivial task. Agility metrics are difficult to define, mainly due to the multidimensionality and vagueness of the concept of agility [18]. However, in order to understand and employ the agile manufacturing principles, one has to be able to measure agility. In [3], the overall problem of measurement is reduced to three simple, yet fundamental questions: what to measure, how to measure it, and how to evaluate the results. More recent approaches utilize knowledge-based techniques, such as fuzzy logic, for the assessment of manufacturing agility ([18], [14]). In these works, the overall agility is measured by the synthesis of individual infrastructures identified in the enterprise. Regardless of the structure of each measure, it is important to establish basic principles which should be satisfied by any such agility measure. It is postulated that any practical agility metric should provide a situation-specific measurement by taking into account the particular characteristics of the system/enterprise under study, and allow for comparisons among different installations. Further, it should incorporate all the accumulated human knowledge/expertise relevant to agility by focusing on specific observable measuring parameters. In view of the above statements, the proposed agility measurement scheme is [15]:

1. Direct: it focuses on the observable operational characteristics that affect agility (direct measurement), such as product variety, versatility, change in quality, networking,
etc., and not on the effects of agility (indirect measurement), such as increased assets or profits, short delivery times, customer satisfaction, etc. The proposed method provides context-specific measurements without changing its structural characteristics every time. The measure will adapt to different manufacturing systems/enterprises and allow agility comparisons among them.
2. Knowledge-based: it is based on the expert knowledge accumulated from the operation of the system under examination, or of similar systems. A good metric should be capable of handling both numerical and linguistic data, resulting in precise/crisp (e.g., agility = 0.85) and/or qualitative (e.g., high agility) measurements.
3. Holistic: it combines all known dimensions of agility. Agility is a multidimensional notion, observable in almost all hierarchical levels of an enterprise. For quantification purposes, it is categorized into several distinct (enterprise) infrastructures.

3.1. Dimensions of agility
Manufacturing systems engineering lacks analytic and closed-form mathematical solutions except in the simplest possible cases. Since manufacturing systems are operated and managed by people, it is necessary to record and utilize human knowledge and perceptions about agility and its factors (parameter quantification and measurement). Algebraic formulae fail in putting together the various dimensions of agility coupled with the human perception of agility. To overcome such problems, the key idea is to model human inference or, equivalently, to imitate the mental procedure through which experts (managers, engineers, operators, researchers) arrive at a value of agility by reasoning from various sources of evidence. To quantify agility, managers and operators frequently use verbal or linguistic values, such as low, average, about high, and so on. Thus, a valid and suitable candidate solution to the problem of measuring enterprise agility should be based on fuzzy logic. The essential concept in agile manufacturing is the integration of organization, people, and technology into a coordinated interdependent system [2], which responds rapidly to changes. The proposed measuring approach involves all the founding concepts of agility expressed, for the sake of analysis, in the following divisions/infrastructures ([14], [15], [18]):
• Production Infrastructure: Deals with plant, processes, equipment, layout, material handling, etc. It can be measured in terms of the time and cost needed to face unanticipated changes in the production system.
• Market Infrastructure: Deals with the external enterprise environment, including customer service and marketing feedback. It may be measured by the ability of the enterprise to identify opportunities, deliver, upgrade products/enrich services, and expand.
• People Infrastructure: Deals with the people within the organization. It may be measured by the level of training and motivation of personnel.
• Information Infrastructure: Deals with the information flow within and outside the enterprise. It may be measured by the ability to capture, manage, and share structured information to support the area of interest.
Figure 1. The architecture of the proposed assessment of agility.
The key idea of this approach is to combine all infrastructures and their corresponding operational parameters, as shown in Figure 1, to determine the overall agility. The value of agility is given by an approximate reasoning method taking into account the knowledge that is included in simple IF-THEN rules. This is implemented via multi-antecedent fuzzy IF-THEN rules, which are conditional statements that relate the observations concerning the allocated divisions (IF-part) with the value of agility (THEN-part). Generally speaking, IF-THEN rules are statements of the form LHS → RHS, where the LHS (Left Hand Side) determines the conditions or situations that must be satisfied and the RHS (Right Hand Side) is the action(s) that must be taken once the rule is applied (or activated). The terms premise or antecedent and conclusion or consequent are frequently used for LHS and RHS, respectively. Each side of a rule may be written in the form of a conjunction: A1, A2, A3, ..., An → B1, B2, B3, ..., Bm,
which means that whenever A1, A2, A3, ..., An hold, actions B1, B2, B3, ..., Bm must be taken. Many times the above rule is written in a natural-language manner
as follows: IF A1, A2, A3, ..., An THEN B1, B2, B3, ..., Bm. An example of such a rule is:

IF the agility of Production Infrastructure is Low AND the agility of Market Infrastructure is Average AND the agility of People Infrastructure is Average AND the agility of Information Infrastructure is Average THEN the overall Enterprise agility is About Low
where Production, Market, People, and Information infrastructures and Enterprise agility are the linguistic variables of the above rule, i.e., variables whose values are linguistic terms, such as Low, Average, About Low, rather than numbers. These linguistic ratings are represented with fuzzy sets having a certain mathematical meaning given by appropriate membership functions. Since the impact of all individual infrastructures on the overall manufacturing agility is hard to compute analytically, fuzzy rules are derived to represent the accumulated human expertise. In other words, the knowledge concerning agility, which is imprecise or even partially inconsistent, is used to draw conclusions about the value of agility by means of simple calculus. In order to explain the structure of fuzzy rules and the fuzzy formalism to be used towards measurement, consider that Ai, i = 1, ..., N, is the set of agility divisions (here N = 4), and LAi the linguistic value of each division. Then, the expert rule can be formulated as follows:

IF A1 is LA1 AND ... AND AN is LAN THEN G is LG
(1)
or, in a compact representation, (LA1 AND LA2 AND ... AND LAN → LG), where LG represents the set of linguistic values for the enterprise agility G. All linguistic values LAi and LG are fuzzy sets with certain membership functions. 'AND' represents the fuzzy conjunction and has various mathematical interpretations within the fuzzy logic literature. Usually it is represented by the intersection of fuzzy sets, which corresponds to a whole class of triangular norms, or T-norms [13]. The selection of the 'AND' connective in the agility rules should be based on empirical testing within a particular installation, as agility means different things to different people. The parameters at the various agility infrastructures are fuzzy sets with certain membership functions. In fuzzy modeling, the membership functions are most of the time empirically chosen. In practice, if one knows the extreme values of membership (0: full non-membership, 1: full membership) for a given concept, then one may interpolate between those numbers. In the proposed measurement model the acquired (initial) knowledge is represented with a number of IF-THEN rules. In order to provide a direct measurement of the overall agility, one needs to know the agility value of each of the infrastructures. Thus, one has to identify certain parameters that indicate agility for each infrastructure. Before doing so, the agility measurement problem is first formulated via fuzzy logic modeling, followed by the definitions of specific measuring parameters for each infrastructure in Section 4.
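To make the formalism concrete, the following sketch (in Python) shows one common way a multi-antecedent rule such as (1) can be evaluated: each linguistic value is represented by a triangular membership function and the 'AND' connective by the min T-norm. This is an illustration only; the function names and the numeric anchors of the triangles are hypothetical and not part of the methodology itself.

    # A minimal sketch: firing a multi-antecedent fuzzy rule with min as 'AND'.
    def triangle(a, b, c):
        """Triangular membership function with support [a, c] and peak at b."""
        def mu(x):
            if x <= a or x >= c:
                return 0.0
            return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
        return mu

    def firing_strength(antecedents, observations):
        """Degree to which a rule applies: min of the antecedent grades."""
        return min(mu(x) for mu, x in zip(antecedents, observations))

    low = triangle(-0.3, 0.0, 0.3)      # hypothetical 'Low'
    average = triangle(0.3, 0.5, 0.7)   # hypothetical 'Average'
    rule_antecedents = [low, average, average, average]
    print(firing_strength(rule_antecedents, [0.1, 0.5, 0.55, 0.45]))  # about 0.67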
4. MODELING OF AGILITY INFRASTRUCTURES
4.1. Production infrastructure
Agility at the production infrastructure level allows for quick reactions to unexpected events, such as machine breakdowns, and minimizes the effect of interruptions of the production process. It refers to the capability of producing a part in different ways by changing the sequence of operations from the one originally scheduled. In order to achieve agility in the production infrastructure (from now on, production agility), a combination of certain desirable characteristics is needed, for example, a combination of multi-purpose machines and fixtures, redundant equipment, material handling devices, and process variety. The parameters defined for the measurement of production agility (AProd) are [15]:
1. Changeover effort (S), in time and cost, that is required for preparations in order to produce a new product mix. It expresses the ability of a system to absorb demand variations. It includes the setup time and cost required for various preparations at the production floor, such as tool or part positioning and release, software changes, etc. Setup time represents the ability of a machine/workstation to absorb changes in the production process efficiently, and it influences production agility heavily when the batch sizes or the product cycles are small. Changeover effort is also associated with the transfer speed of the material handling system.
2. Versatility (V), which is defined as the variety of operations the production system is capable of performing.
3. Range of adjustments or adjustability (R) of a system, which is related to the maximum and minimum dimensions of the parts that the production system can handle.
4. Substitutability (SB), which is the ability of a production system to reroute and reschedule jobs effectively under failure conditions. The substitutability index may also be used to characterize some built-in capabilities of the system, for example, real-time scheduling or available transportation links.
5. Operation commonality (Co), which expresses the number of common operations that a group of machines can perform in order to produce a set of parts.
6. Variety of loads (P), which a material handling system carries, such as work pieces, tools, jigs, fixtures, etc. It is restricted by the volume, dimension, and weight requirements of the load.
7. Part variety (Vp), which is associated with the number of new products the manufacturing system is capable of producing in a time period without major investments in machinery. It takes into account all variations of the physical and technical characteristics of the products.
8. Part commonality (Cp), which refers to the number of common parts used in the assembly of a final product. It measures the ability to introduce new products fast and economically and also indicates the differences between two parts.

Specifically, let Ti, i = 1, ..., 8, denote the set of parameters of concern, such that LTi are the linguistic values corresponding to each Ti. The rule, which represents the
expert knowledge on how all the previously defined parameters affect the production agility AProd, is:

IF T1 is LT1 AND ... AND T8 is LT8 THEN AProd is LAProd

(2)
where LAProd is the linguistic value of production agility, 'AND' denotes the fuzzy conjunction, and → is the fuzzy implication.

4.2. Market infrastructure
At the level of the market infrastructure, agility is characterized by the ability to identify market opportunities, to develop short-lifetime, customizable products and services, and by the ability to deliver them in varying volumes faster and at a lower price. It is associated with the ability of a firm to change focus by expanding or reducing its activities. The parameters identified for the market infrastructure agility (AMarket) are:
1. Reconfigurability (Ps) of the product mix. It is defined as the set of part types that can be produced simultaneously or without major setup delays resulting from large-scale reconfigurations.
2. Modularity index (MD), which represents the ease of adding new customized components without significant effort. The significance of product modularity for the agile company is discussed in [5].
3. Expansion ability (CE), which is the time and cost needed to increase/decrease the capacity to a given level without affecting quality.
4. The range of volumes (Rv) at which the firm is run profitably. It can be regarded as the response to demand variations and implies that the firm is productive even at low utilization. It is also associated with the hiring of temporary personnel to meet changes in market demand.

The generic measuring rule for the agility of this infrastructure is as follows:

IF T1 is LT1 AND ... AND T4 is LT4 THEN AMarket is LAMarket

(3)
where the notation in (3) follows that of (2).

4.3. People infrastructure
The profitability of an agile company is determined by the knowledge and the skills of its personnel and the information they have or have access to. Workforce empowerment, self-organizing and self-managing cross-functional teams, performance- and skill-based compensation, flatter managerial hierarchies, and distributed decision-making authority are all parameters affecting agility. By taking advantage of an agile
workforce, a firm is able to respond quickly to unexpected workloads that may arise. The variables defined as agility level indicators of this infrastructure (APeople) are:
1. Training level (W). Personnel training contributes significantly towards agility, and it can be achieved through education and cross-training programs.
2. Job rotation (J), which is related to training and expresses the frequency with which workers are transferred to new work positions under normal conditions.

The generic fuzzy rule can be written as follows (the notation is similar to (2) and (3)):
IF W is LW AND J is LJ THEN APeople is LAPeople
(4)
4.4. Information infrastructure
The information infrastructure plays a critical role in the development of the enterprise's agile capabilities, especially in the context of global and distributed organizations. The concept of multi-path agility [7] is used to improve productivity and response time. It is achieved by improvements in the information infrastructure, by shortening the response of individual entities on a single path and by selecting alternative routes. The variables indicating the information infrastructure agility (AInfo) are:
1. Interoperability (I), which is a measure of the level of standardization and provides an indication of the information infrastructure agility. In a distributed, virtual organization, the exchange and storage of information is necessary for the proper functioning of the enterprise.
2. Networking (N), which covers the communication capabilities of an enterprise, defined through the ability to exchange information. This exchange takes place at the management level, production level, etc. How well an enterprise is "connected" and capable of providing and utilizing information depends heavily on the networking infrastructure, both on the density of connections and on their functionality (bandwidth, reliability, etc.).

The generic fuzzy rule for this infrastructure can be written as follows:

IF I is LI AND N is LN THEN AInfo is LAInfo
(5)
The notation is similar to (2), (3), and (4).

4.5. Discussion
Table 1 lists all the proposed parameters for the modeling and evaluation of the agility infrastructures. The values of these parameters, which can be derived from simulation and/or real-life data, are represented by certain membership functions. Most of the time, the membership functions are empirically chosen in fuzzy modeling.
Table 1. Parameters of the agility infrastructures

Infrastructure   Parameter                               Symbol
Production       Changeover effort                       S
                 Versatility                             V
                 Range of adjustments or adjustability   R
                 Substitutability                        SB
                 Operation commonality                   Co
                 Variety of loads                        P
                 Part variety                            Vp
                 Part commonality                        Cp
Market           Reconfigurability                       Ps
                 Modularity index                        MD
                 Expansion ability                       CE
                 Range of volumes                        Rv
People           Training level                          W
                 Job rotation                            J
Information      Interoperability                        I
                 Networking                              N
Mathematically speaking, measurement of membership means assigning numbers to objects (points, concepts, etc.), such that certain relations between numbers reflect analogous relations between objects. For a given context, if we show that there is a mapping f : E → N from an empirical relation structure E into a numerical relation structure N, then a scale ⟨E, N, f⟩ exists [13]. Although the agility infrastructures and parameters shown in Table 1 are not independent, they are combined via IF-THEN rules, which are the knowledge representation tool within the discussed measuring approach. Given a specific enterprise, and given certain performance criteria, one may experiment with the relative importance of the rules to arrive at what may be considered an "acceptable agility measurement". Within the proposed framework, there may be more than one way to reach such acceptable agility measurements, reflecting different relative weights of the agility infrastructures. There is no proof that the selection of a rule or a membership function is optimal. But after a certain period of measurements for a given enterprise, one may check and evaluate the contribution of each rule (and membership function) to the agility assessment. Rules with no contribution can be deleted. Furthermore, the conjunction operator 'AND' used in the IF-THEN rules can be represented by a whole class of intersection-based connectives. The most frequently used 'AND' is the min (∧) operator. A suitable operator may also be the so-called "compensatory AND" or "γ-operator" [13], which is an example of an averaging operator giving values that range from the intersection to the union of the combined sets, as follows: A AND B = γ(A ∪ B) + (1 − γ)(A ∩ B). Specific values of γ could represent experts' opinions for a given context. Consider, for example, the case of the people infrastructure. The fuzzy rules used in the measurement contain two variables, namely, training level W and job rotation J, as follows: IF W is LW AND J is LJ THEN APeople is LAPeople. The value of the conjunction (LW AND LJ) controls the level of LAPeople. A pessimistic value (γ = 0) restricts the value of LW AND LJ to the minimum membership, while the optimistic one (γ = 1) outputs the union of the individual membership functions.
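Stated in code, the γ-operator is a one-line combination of min and max over membership grades (min and max standing, as is common, for intersection and union). The sketch below is an illustration under that interpretation and is not taken from the original methodology:

    def compensatory_and(mu_a, mu_b, gamma):
        """gamma = 0: pure intersection (min); gamma = 1: pure union (max)."""
        return gamma * max(mu_a, mu_b) + (1 - gamma) * min(mu_a, mu_b)

    print(compensatory_and(0.4, 0.8, 0.0))  # 0.4: a pessimistic expert
    print(compensatory_and(0.4, 0.8, 1.0))  # 0.8: an optimistic expert
    print(compensatory_and(0.4, 0.8, 0.5))  # 0.6: a compromise in between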
Table 2. Data for the agility infrastructures

Agility Infrastructure   Available data
Production               <S is Low>  <V is High>  <SB = 0.7>  <Vp is Average>
Market
People                   <W is Average>  <J is Low>
Information              <I is Low>
5. AN EXAMPLE
An example of how the measurement methodology works is given in this section. It is important to keep in mind that one can select measuring parameters according to the problem at hand. Assume that at a given time the agility parameters of an enterprise take the values presented in Table 2. For the parameters that do not appear in Table 2, data are not available. All variables take values in [0, 1]. The membership functions of the linguistic values are assumed to be sets of ordered pairs (x, μ(x)), where x is the value and μ(x) is the membership grade of x, in the same interval, as follows: Low = L = {(0, 1), (0.1, 1), (0.3, 0)}, Almost Low = AL = {(0.15, 0), (0.3, 1), (0.45, 0)}, Average = A = {(0.3, 0), (0.5, 1), (0.7, 0)}, Almost High = AH = {(0.55, 0), (0.7, 1), (0.85, 0)}, High = H = {(0.7, 0), (0.9, 1), (1, 1)}.
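These point sets can be evaluated directly in code. The sketch below stores each linguistic value as its (x, μ(x)) breakpoints and interpolates linearly between them; the linear interpolation and the names are assumptions made for illustration:

    # Linguistic values as (x, mu) breakpoints; membership between breakpoints
    # is obtained by linear interpolation (an assumption for this sketch).
    TERMS = {
        "L":  [(0.0, 1.0), (0.1, 1.0), (0.3, 0.0)],
        "AL": [(0.15, 0.0), (0.3, 1.0), (0.45, 0.0)],
        "A":  [(0.3, 0.0), (0.5, 1.0), (0.7, 0.0)],
        "AH": [(0.55, 0.0), (0.7, 1.0), (0.85, 0.0)],
        "H":  [(0.7, 0.0), (0.9, 1.0), (1.0, 1.0)],
    }

    def membership(points, x):
        """Membership grade of x, interpolated between the listed breakpoints."""
        if x <= points[0][0]:
            return points[0][1]
        for (x0, m0), (x1, m1) in zip(points, points[1:]):
            if x0 <= x <= x1:
                return m0 + (m1 - m0) * (x - x0) / (x1 - x0)
        return points[-1][1]

    print(membership(TERMS["AH"], 0.7))    # 1.0
    print(membership(TERMS["AH"], 0.775))  # 0.5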
The rules are of the Mamdani type [13] and the connective is AND = ∧ = min. For the production infrastructure, AProd, the activated rules, i.e., the rules whose antecedents match the observations and therefore best describe their meaning, are:

IF <S is L> AND <V is H> AND <R is H> AND <SB is AH> AND <Vp is A> THEN <AProd is AH>,
IF <S is L> AND <V is H> AND <R is AH> AND <SB is AH> AND <Vp is A> THEN <AProd is AH>.
By applying individual-rule based inference [9] we compute the discrete membership function of the production infrastructure [15]: LAProd = {(0.55, 0), (0.6, 0.5), (0.8, 0.5), (0.85, 0)}.
Figure 2. Agility infrastructures plot.

In practice, a number in [0, 1] may be preferable to a membership function in order to represent agility. The procedure that converts a membership function
into a single point-wise value is called defuzzification. One can choose among various defuzzification methods reported in the literature. Here, by applying the so-called Center-of-Area defuzzification method, we derive the crisp value of production infrastructure agility as follows:
defLAProd = (0.55 · 0 + 0.6 · 0.5 + 0.8 · 0.5 + 0.85 · 0) / (0 + 0.5 + 0.5 + 0) = 0.7.

Similarly, the membership functions of the market (AMarket), people (APeople), and information (AInfo) infrastructures are: LAMarket = {(0.15, 0), (0.3, 1), (0.45, 0)}, LAPeople = {(0.15, 0), (0.3, 1), (0.45, 0)}, LAInfo = {(0, 1), (0.1, 1), (0.3, 0)}. The defuzzified/crisp values are defLAMarket = 0.3, defLAPeople = 0.3, and defLAInfo = 0.1, as can be seen in Figure 2. The knowledge concerning the overall agility variations is represented by fuzzy rules as in (1). The rule which is closest to the observations, i.e., to the computed membership functions of the infrastructures, is:

IF <AProd is AH> AND <AMarket is AL> AND <APeople is AL> AND <AInfo is L> THEN <G is AL>.

Applying individual-rule based inference between the above rule and the observed membership functions, we compute the overall agility in membership function form, that is, LG = {(0.15, 0), (0.25, 0.5), (0.35, 0.5), (0.45, 0)}. The overall agility (over all four infrastructures) is shown by the grey area in Figure 2. The crisp value of agility, according to the Center-of-Area defuzzification method, is defLG = 0.3.
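The Center-of-Area figures above are easy to verify over the discrete point sets; a short sketch follows (the function name is illustrative):

    def center_of_area(points):
        """Discrete Center-of-Area: sum(x * mu) / sum(mu)."""
        return sum(x * mu for x, mu in points) / sum(mu for _, mu in points)

    la_prod = [(0.55, 0.0), (0.6, 0.5), (0.8, 0.5), (0.85, 0.0)]
    lg      = [(0.15, 0.0), (0.25, 0.5), (0.35, 0.5), (0.45, 0.0)]
    print(center_of_area(la_prod))  # 0.7, the production infrastructure agility
    print(center_of_area(lg))       # 0.3, the overall enterprise agility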
As mentioned in the previous paragraph, membership functions are most of the time empirically chosen in fuzzy modelling. Further, there is no proof that the selected shape of a membership function is optimal. In order to examine the effect the shape of the membership function has on the outputted value of agility, various simulation runs have been performed. Figure 3 presents the variations of the agility value when using Gaussian, triangular, and trapezoidal membership functions. As can be seen, the agility values are more or less the same for the three different membership shapes. The small variations that have been observed indicate that the significance of the membership function type in the proposed measuring methodology is limited. The defuzzification method proved to be a factor of increased significance for the measurement of agility. This is due to the important role of defuzzification in fuzzy logic systems. Figure 4 presents the observed variations of agility values for four different defuzzification methods, namely, Centroid (or Center-of-Area), Mean-of-Maximum, Smallest-of-Maximum, and Largest-of-Maximum. It can be observed that the outputted agility values depend on the selected defuzzification method. This is a well-known structural characteristic of fuzzy logic based systems; thus, the selection of the defuzzification formula requires a close examination of the problem under study. An extensive discussion on the selection of defuzzification methods can be found in [16].

Figure 3. Agility measurements for different types of membership functions.

Figure 4. The effect of defuzzification methods on agility measurements.
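This sensitivity can be reproduced on even a single discrete output set. The sketch below, an illustration rather than the chapter's implementation, applies the four methods of Figure 4 to the overall-agility set LG computed earlier:

    def centroid(pts):
        return sum(x * m for x, m in pts) / sum(m for _, m in pts)

    def maxima(pts):
        """x-values at which the membership grade attains its maximum."""
        peak = max(m for _, m in pts)
        return [x for x, m in pts if m == peak]

    lg = [(0.15, 0.0), (0.25, 0.5), (0.35, 0.5), (0.45, 0.0)]
    xs = maxima(lg)
    print(centroid(lg))       # 0.30  Centroid (Center-of-Area)
    print(sum(xs) / len(xs))  # 0.30  Mean-of-Maximum
    print(min(xs))            # 0.25  Smallest-of-Maximum
    print(max(xs))            # 0.35  Largest-of-Maximum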
6. CONCLUDING REMARKS

An agility measurement methodology based on acquired knowledge is described in this chapter. Knowledge is represented via linguistic IF-THEN rules, which have a number of clear advantages over other representation techniques. The challenge in deriving agility measurements stems from the fact that the parameters involved in the measurement of agility are not (or may not be) homogeneous. An additional difficulty in measuring agility is the lack of a one-to-one correspondence between agility factors and physical characteristics of the enterprise. As a result, some parameters behave inconsistently in the measurement of agility. The chapter presents a novel and innovative effort to provide a solid framework for determining and measuring enterprise agility that overcomes the above-mentioned difficulties. The proposed measurement framework is direct, adaptive, holistic, and knowledge-based. In order to calculate the overall agility of an enterprise, a set of quantitative agility parameters is proposed, defined with the aid of fuzzy logic and grouped into production, market, people, and information infrastructures, all contributing to the overall agility measurement. From a technical point of view, the proposed framework has the following advantages [14], [15], [18]:

1. It is adjustable by the user. Within the context of fuzzy logic, one can define new variables, values, or even rules and reasoning procedures. The model, therefore, provides a situation-specific measurement and is easily expanded.
2. It contributes to the acquisition and the representation of expertise concerning agility through multiple-antecedent IF-THEN rules.
3. It provides successive aggregation of the agility levels as they are expressed through the already known agility types and, furthermore, incorporates types which have not been widely addressed, such as the agility of the workforce.
4. It can be easily implemented within a simulation testbed.

A topic of future research should be the examination of the relationship between financial performance and the agility level measured in an enterprise. The results of such a study would be useful in determining how much agility is needed and to what extent it affects the profitability of a firm. Further, when one considers a company as a "whole entity", a topic that needs to be studied is how the Research and Development sector contributes to the company's agility. Said differently, it is important to tackle how the quality of R&D and related activities affects the overall agility measurement.
REFERENCES

[1] Goldman, S. L., and Preiss, K., 21st Century Manufacturing Enterprise Strategy: An Industry-Led View, Bethlehem, PA, Iacocca Institute at Lehigh University, 1991.
[2] Kidd, P. T., Agile Manufacturing: Forging New Frontiers, Addison-Wesley, 1994.
[3] Goldman, S. L., Nagel, R. N., and Preiss, K., Agile Competitors and Virtual Organizations: Strategies for Enriching the Customer, New York, Van Nostrand Reinhold Company, 1995.
[4] Goranson, H. T., "Metrics and models," Enterprise Integration Modeling (C. J. Petrie, Jr., editor), Cambridge, Massachusetts, MIT Press, pp. 78-84, 1992.
[5] He, D. W., and Kusiak, A., "Design of assembly systems for modular products," IEEE Transactions on Robotics and Automation, vol. 13, pp. 646-655, 1997.
[6] Lefort, L., and Kesavadas, T., "Interactive virtual factory for design of a shop floor using single cluster analysis," Proceedings of the 1998 IEEE International Conference on Robotics and Automation, pp. 266-271, 1998.
[7] Sanderson, A. C., Graves, R. J., and Millard, D. L., "Multipath agility in electronics manufacturing," Proceedings of the 1994 IEEE International Conference on Systems, Man, and Cybernetics, 1994.
[8] Tsourveloudis, N. C., and Phillis, Y. A., "Manufacturing flexibility measurement: A fuzzy logic framework," IEEE Transactions on Robotics and Automation, vol. 14, no. 4, pp. 513-524, 1998.
[9] Zadeh, L. A., "A theory of approximate reasoning," Machine Intelligence, vol. 9, pp. 149-194, 1979.
[10] DeVor, R., Graves, R., and Mills, J., "Agile manufacturing research: Accomplishments and opportunities," IIE Transactions, vol. 29, no. 10, pp. 813-823, 1997.
[11] Demand activated manufacturing architecture, Tech. Rep. DAMA-1-195, Department of Energy, Version 1.1, Feb. 1995.
[12] Cobb, C. K., and Gray, W. H., "Integrating a distributed, agile, virtual enterprise in the TEAM program," CALS Expo 96, 1996.
[13] Zimmermann, H.-J., Fuzzy Set Theory and its Applications, 2nd edition, Kluwer, Dordrecht, The Netherlands, 1991.
[14] Tsourveloudis, N. C., and Phillis, Y. A., "A Measure for Manufacturing Agility," Proceedings of the 4th World Automation Congress, ISOMA-9947, Maui, Hawaii, USA, 2000.
[15] Tsourveloudis, N. C., and Valavanis, K. P., "On the Measurement of Enterprise Agility," International Journal of Intelligent and Robotic Systems, vol. 33, no. 3, pp. 329-342, 2002.
[16] Driankov, D., Hellendoorn, H., and Reinfrank, M., An Introduction to Fuzzy Control, 2nd edition, Springer-Verlag, 1996.
[17] Balic, J., Phillis, Y. A., Tsourveloudis, N. C., and Pahole, I., "Flexibility in Manufacturing: Models and Measurement," University of Maribor, Faculty of Mechanical Engineering, Maribor, Slovenia, ISBN 86-435-0510-2, 2002.
[18] Tsourveloudis, N. C., Valavanis, K. P., Gracanin, D., and Matijasevic, M., "On the Measurement of Agility in Manufacturing Systems," Proceedings of the 2nd European Symposium on Intelligent Techniques, Chania, Greece, June 1999.
KNOWLEDGE-BASED SYSTEMS TECHNOLOGY IN THE MAKE OR BUY DECISION IN MANUFACTURING STRATEGY
P. HUMPHREYS AND R. MCIVOR
1. INTRODUCTION
Since the 1970s, the role of the purchasing function has gone through considerable change. In the past, it was regarded as a clerical function with the objective of purchasing a good/service at the lowest price. In the early 1970s, Ammer [1] found that top management viewed purchasing as having a passive role in the organisation, with purchasing being an administrative rather than a strategic function. However, the 1973-74 oil crisis and related raw material shortages drew significant attention to the importance of purchasing [2]. Porter [3], in his seminal work on the forces that shape the competitive nature of industry, identified buyers and suppliers as two of the critical forces. Thus, the strategic importance of the purchasing function to the organisation was beginning to receive recognition in the literature. This trend continued with the purchasing function being recognised as making a significant contribution to an organisation's success [4, 5], and has resulted in purchasing assuming a more strategic role in many organisations [6, 7]. One of the core issues to have emerged in strategic purchasing has been the growing importance of the make or buy decision [8].
The aim of this chapter is to show how knowledge based systems technology can assist in the area of strategic purchasing. The authors discuss a knowledge based system (KBS) designed to help companies in the make or buy decision, which is arguably the most fundamental component of manufacturing strategy [9]. In recent years, many companies have been moving significantly away from 'making' towards 'buying' [10].
However, research carried out by Ford et al. [11] has revealed that make or buy decisions are rarely taken within a thoroughly strategic perspective. They found that many firms adopt a short-term perspective and are motivated primarily by the search for short-term cost reductions, with little consideration being given to the content of the decision-making process. The make or buy model described in this chapter attempts to overcome some of these problems by offering a structure for an organisation to follow in the make or buy decision. Within the description of this KBS there is a specific focus on the issues involved in the application of case based reasoning (CBR) techniques and Multi-Attribute Analysis (MAA) to the automation of the make or buy decision. The development of this system is intended to illustrate that a case based system should be capable of providing sound solutions utilising relatively small case libraries, while avoiding the large rule base which would be required if rule based reasoning were used exclusively.

THE MAKE OR BUY DECISION
The make or buy decision is being given more consideration within organisations because of its strategic implications. The make or buy decision can often be a major determinant of profitability, making a significant contribution to the financial health of a company [12]. Prior to the early 1970s, buying by organisations had been done largely on the basis of obtaining the best price, while taking into account a few other factors such as quality and delivery. However, in many cases a significant number of factors, such as delivery reliability, technical capability, cost capability and the financial stability of the supplier, were not taken into consideration [13]. Few companies have taken a strategic view of make or buy decisions, with many companies deciding to buy rather than make for short-term reasons of cost reduction and capacity [11]. In addition, some organisations may find themselves in a position that has been inherited from past management decision-making. Their position in the supply chain is already established and the extent of vertical and horizontal integration already mapped out. However, this is likely to have occurred due to a series of short-term decisions with no consideration for the long-term strategic direction of the organisation. Some of the key problems encountered by companies in their efforts to formulate an effective make or buy decision are as follows:
(i) No Formal Method for Evaluating the Decision
Many companies have no firm basis for evaluating the make or buy decision. Blaxill and Hout [14] have found that many firms make sourcing decisions primarily on the basis of overhead costs. The choice of which components to outsource is made by ascertaining what will save most on overhead costs, rather than what makes the most long-run business sense. Companies are failing to consider issues such as:

• What are the organisational implications of the sourcing decision?
• Do the internal design and manufacturing capabilities lag behind those of potential suppliers?
• Will customers recognise a difference in the finished product if the company outsources some of its components?
(ii) Inaccurate Costing Systems
In many instances, companies base their sourcing decisions on cost issues. However, the results of studies carried out on the cost accounting practices and financial performance systems used by US manufacturing firms have shown that many of the accounting systems in these organisations have not kept pace with the changes in industry and the technology used in production [15]. This situation can lead companies to choose a strategy of de-emphasising and overpricing products that are highly profitable, while expanding commitments to complex, unprofitable lines. Furthermore, surveys by the American Management Association (AMA) show that companies have very little inclination to adopt new costing methods such as Activity Based Costing [16].

(iii) The Competitive Implications of the Decision
Sourcing decisions can impact upon flexibility, customer service and the core competencies of the organisation. Prahalad and Hamel [17] postulate that companies who measure competitiveness in terms of price only are possibly inviting the erosion of their core competencies. The embedded skills that give rise to the next generation of competitive products cannot be 'rented-in' by outsourcing. It is interesting to note the contrasting practices of US and Japanese car makers. GM tend to view such major parts as gearboxes and engines as just components, whereas Honda view the engine as a critical component and would never consider outsourcing its manufacture or design.

A DESCRIPTION OF THE MAKE OR BUY MODEL
The first stage in developing the system was to conduct a literature review to develop a clear understanding of the process involved in the make or buy decision. Make or buy is a central theme in the ideas of manufacturing strategy, as discussed in the work of Hayes et al. [18] and Platts and Gregory [19]. The issue is considered from a variety of perspectives, including the level of vertical integration of the firm, the span of manufacturing processes in a business, and the nature of vendor evaluation and relationships. It is clear from its central position as one of the structural decision areas that its impact on the other areas will be significant. In particular, the make or buy decision will influence issues such as capacity and facility design as well as new product development. There are few practical accounts of a methodical approach to the make or buy decision process to be found in the literature, although discussion of the factors involved has received a significant amount of coverage. For example, authors such as Jennings [20] and Quinn and Hilmer [21] identify issues such as costs, core and peripheral activities, supplier relationships and technologies which should be considered in the outsourcing decision, without proposing a framework that would guide a company through the process. Venkatesan [22] describes the approach adopted at Cummins Engine, which introduces the concept of linking product differentiation, component families analysis and manufacturing capability as a way of deciding which activities should be carried out by suppliers and which internally by the organisation. However, the means by which this assessment of importance is to be made is not presented in any detail.
Welch and Nayak [23], based on their experiences in US manufacturing organisations, suggested a generic framework to assist firms in evaluating sourcing decisions, which they termed the strategic sourcing model. This tool augments the traditional cost analysis by considering strategic and technological factors in the decision-making process. In addition, factors such as the competitive advantage of the process technology, its maturity and competitors' process technology positions are all considered in making the final sourcing decision. Nonetheless, there is no practical demonstration of the benefits of the models in terms of evidence from organisations that have adopted such an approach. The missing aspect in all accounts so far reviewed is a sufficiently detailed yet generic methodology that may be implemented by practising managers. Probert [24] has attempted to rectify the situation by proposing a 4-stage process for the strategic make or buy decision. The various stages in his methodology are:

• Initial business appraisal-collection of company, competitor and supplier data, as well as evaluation of the strategic issues which face the firm;
• Internal/external analysis-identifying major parts families, manufacturing processes and cost allocations, and alignment of parts and technologies on the competitiveness/importance matrix;
• Evaluation of strategic options-assessment of the various sourcing options identified in Stage 2 in conjunction with the business data obtained;
• Choice of optimal strategy-applying financial decision support models to evaluate the various sourcing strategies and to identify the most appropriate fit with the organisation's current and future operations.

Probert applied the strategic make or buy methodology to six engineering manufacturing businesses, which reported positively on its usefulness, with projected business results of 20-40% improvements in return on capital employed and 30-60% stock/lead-time reductions. On completion of the literature review, the next phase was to talk directly with purchasing practitioners to elicit their views on the key steps involved in the make or buy decision. A series of structured interviews with senior procurement managers in ten multi-national organisations was conducted. It should be noted that the companies come from a variety of industries, including electronics, mechanical engineering, aerospace, chemicals and medical packaging. As a result of these discussions and the literature review, a generic model of the make or buy decision-making process was developed and is outlined below. The next stage was to computerise the most important components of the system to enable feedback from procurement managers in two of the multi-nationals first interviewed. For a fuller description of the questionnaire used in the interviews and of the model, see Humphreys et al. [25] and McIvor et al. [26] respectively. It must be emphasised that the model described in this chapter is not a panacea for all of the problems associated with making an effective make or buy decision. The model attempts to overcome some of the problems that companies have in formulating a make
or buy decision, as identified from the literature review and the interviews conducted with senior procurement managers in ten multi-nationals, and is designed to act as a decision aid for an organisation in the formulation of this decision. An important implication of the model is that organisations should give strategic attention to the make or buy decision. When a large proportion of the resources of a company are provided by outside suppliers, this becomes of even greater importance [27]. The make or buy model is intended primarily for use with strategic items, focusing on a partnership-type relationship with the selected supplier. Strategic items are generally obtained from one supplier, and/or they concern products of which the short- and long-term supply is not guaranteed. Furthermore, they represent a considerable value in the cost price of the end product. Examples are engines and gearboxes for automobile manufacturers. The sourcing decision for a strategic component is one of the most difficult for any company. To carry out this decision effectively, it is suggested that a team from various parts of the business should be formed to develop and implement an appropriate strategy for the item [28]. This cross-functional team should be represented by the manufacturing, purchasing, finance, engineering, quality and customer service functions or teams. The stages involved in the make or buy model are illustrated in Figure 1. An outline will now be presented of each of the stages involved in the make or buy decision.

Stage 1-Identification of Performance Categories
The first step in the process is to identify the key performance categories that are required to specify, design and manufacture the component. These are the technical capability categories and are outlined below, along with a sample criterion from each.

Technical capability categories
• Quality-Quality Costs/Sales Ratio
• Delivery-Percentage On-time Delivery
• Customer Service-Customer Inquiry Response Time

Each of these categories is then given a weighting representing its importance to the analysis of technical capability. The next step is to identify the key performance categories that will provide a sound indication of the compatibility of the supplier organisation with the purchasing organisation. These issues are crucial due to the partnership character of the buyer-supplier relationship. The organisation profiles are outlined below, along with a sample criterion from each.

Suppliers' organisation categories
• Achievement of Financial Objectives-Return on Investment
• Organisation Culture-Top Management Compatibility
• Technology-Current Manufacturing Capabilities
• Achievement of Sales Objectives-Sales Growth
• Health and Safety-Lost Time due to Accidents
Figure 1. The make or buy model.
Each of these categories is then given a weighting representing its importance to the analysis of the suppliers' organisation.

Stage 2-An Analysis of the Technical Capability Categories
The objective of this stage is to identify, in rank order, those suppliers who are technically competent in their ability to supply the item. The performance of potential sources of supply (internal and external) is assessed and evaluated against the categories and criteria identified in Stage 1. An important issue which became apparent from discussions with procurement managers is that multi-nationals have to assess the technical capability of their sister companies.

Stage 3-Comparison of Retrieved Internal and External Technical Capability Profiles
This stage involves a comparison of the internal and external capabilities with the "Best-in-Class" on the range of criteria identified. The performance of each potential supplier retrieved should also be compared with the "Best-in-Class". At any point in time, the "Best-in-Class" score on any individual criterion is the highest possible benchmark world-wide. This is similar to the decathlon in athletics, where the score in any individual event is related to the current world record in that event. Having a "Best-in-Class" score allows potential suppliers to compare their performance against the best available suppliers world-wide. The sourcing company's technical capability performance measure will be compared against the best potential suppliers. If it is found that there are no suppliers as capable as the purchasing company, then the purchasing company may feel that a "Make" decision is the most effective course of action. However, if there are suppliers who are technically competent (which may include the purchasing company), then further analysis of these suppliers' organisations is required.
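As a rough illustration of how Stages 1-3 might be operationalized, the sketch below normalizes each criterion score against a "Best-in-Class" benchmark (echoing the decathlon analogy) and combines the results with the Stage 1 weights. All category names, weights and scores here are hypothetical:

    # Hypothetical weighted scoring against "Best-in-Class" benchmarks.
    WEIGHTS = {"quality": 0.4, "delivery": 0.35, "customer_service": 0.25}
    BEST_IN_CLASS = {"quality": 0.98, "delivery": 0.99, "customer_service": 0.95}

    def technical_capability(scores):
        """Weighted sum of criterion scores, each relative to best-in-class."""
        return sum(w * scores[c] / BEST_IN_CLASS[c] for c, w in WEIGHTS.items())

    internal = {"quality": 0.90, "delivery": 0.92, "customer_service": 0.85}
    supplier = {"quality": 0.97, "delivery": 0.88, "customer_service": 0.93}
    print(technical_capability(internal))  # the sourcing company's own profile
    print(technical_capability(supplier))  # a potential external supplier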
Stage 4-An Analysis of the Suppliers' Organisations

The purpose of this analysis is illustrated by an example where some suppliers may be proficient technically but have poor financial stability and a management style incompatible with the purchasing company. As Ellram and Edis [29] point out, these are key factors if the buyer is considering forging a close co-operative relationship with the supplier for a strategic purchased item. Profiles of suppliers' organisations will include 'soft factors' that are difficult to quantify. These 'soft factors' concentrate not only on immediate concerns but also on the long-term ramifications associated with a potential relationship with a given supplier. They include factors such as financial stability, strategic fit, and top management compatibility. The purpose here is to demonstrate that such factors, usually less quantifiable in nature, are as important when a firm is seeking a supplier partnership as those that are typically included in current supplier selection models. Issues such as strategic direction and management compatibility are very important when a company is faced with the selection of a supplier with whom they wish to establish a strategic partnership. If it is found that no supplier has a suitable organisation profile with which to initiate a partnership, then a "Make" strategy would be the preferred option. However, if a number
of suppliers have been identified as suitable, then further analysis of the Total Acquisition Cost involved with these suppliers, as well as with the purchasing company, is required.

Stage 5-Total Acquisition Cost Analysis
It is not within the scope of this article to give a full description of the measurement of Total Acquisition Cost (T.A.C.) in the make or buy model. However, a brief overview of the steps involved is presented to demonstrate how this stage fits into the overall model. Total Acquisition Cost sums all the actual and potential costs involved in the purchasing process [30]. It encompasses all costs associated with the acquisition of a good or service throughout the entire supply chain, not just the purchase price. It considers costs from initial idea conception, such as collaborating with a supplier in the design phase of the component, through to any costs (for example, warranty claims) associated with the component once the completed product is in use by the final customer. When the costs have been derived for the internal and potential external suppliers, the make or buy decision can be completed by the purchasing company. If the potential suppliers identified in the previous stages of the make or buy analysis have a higher acquisition cost, then a 'Make' decision should be made. However, if the potential suppliers have a lower acquisition cost, then a 'Buy' decision should be taken, and the purchasing company would then proceed to the supplier selection process.
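Conceptually, T.A.C. is a sum of cost elements across the whole acquisition life cycle. The following minimal sketch illustrates the idea; the cost element names are illustrative assumptions, not a definitive breakdown from the model or from [30].

```python
# A hedged sketch: Total Acquisition Cost as the sum of life-cycle cost
# elements; the element names below are illustrative assumptions.
def total_acquisition_cost(cost_elements: dict[str, float]) -> float:
    return sum(cost_elements.values())

supplier_x = {
    "purchase_price": 120_000.0,
    "design_collaboration": 8_000.0,   # costs from initial idea conception
    "inbound_logistics": 5_500.0,
    "incoming_inspection": 2_000.0,
    "warranty_claims": 3_200.0,        # costs arising after final-customer use
}
# Compare against the equivalent internal ("make") cost to inform the decision
print(total_acquisition_cost(supplier_x))
```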
THE MAKE OR BUY SYSTEM

The make or buy decision is highly complex and one of the most difficult tasks faced by organisations. It requires substantial judgement to assess the wide range of trade-offs present, to recognise all the alternatives available, and to make a decision that balances both the short- and long-term needs of an organisation. In addition, as organisational requirements and market conditions change, a decision that was appropriate in the past may have to be resolved in a totally different manner in the future. Some commentators believe that knowledge-based system (KBS) technology has the potential to play a more significant role in improving the quality and cost effectiveness of unstructured strategic purchasing decisions [31].

KNOWLEDGE-BASED SYSTEMS (KBS) AND CASE-BASED REASONING (CBR)
KBS are computer programs that solve problems by emulating the problem-solving behaviour of one or more human experts. Generating a KBS involves capturing the knowledge and the problem-solving logic and methodology associated with real-world problems in a particular domain. The application of KBS in purchasing management decision making has been limited. Cook [31] identifies three KBS applications adopted by the US Navy to assist in the supplier evaluation and tendering process. Commercial organisations that have successfully applied KBS in the purchasing area include IBM, DEC and Data General, where such systems are used to source parts on
complex customer orders [32]. Recently, Vokurka et al. [33] outlined a prototype KBS for the evaluation and selection of potential suppliers that took into consideration the importance of the purchased item to the sourcing company.

Case-based reasoning is a subset of knowledge-based systems [34]. It is a problem-solving approach that relies on past, similar cases to find solutions to problems, to modify and critique existing solutions, and to explain anomalous situations [35]. CBR is a rich and knowledge-intensive method for capturing past experiences, enhancing existing problem-solving methods and improving the overall learning capabilities of machines [36]. A CBR system mirrors the problem-solving approach taken by a manager who solves current problems using past experience, and provides decision support to managers through an interactive question-and-answer session. In CBR, a new problem or situation is compared with a library of stored cases, the case base. Each case contains information regarding a specific problem situation and its solution. Case-based reasoning systems show significant promise for improving purchasing management decisions in problem areas that are complex, unstructured and knowledge poor. CBR systems, used as purchasing decision support tools, result in faster, more accurate, more consistent, higher quality and less expensive decisions [37]. Aamodt and Plaza [38] describe CBR as a cyclical process comprising the four REs:

1. RETRIEVE the most similar case(s);
2. REUSE the case(s) to attempt to solve the problem;
3. REVISE the proposed solution if necessary;
4. RETAIN the solution as part of a new case.

A new problem is matched against cases in the case base using heuristic, indexed retrieval methods, with one or more similar cases being retrieved. A solution suggested by the matching cases is then reused and tested for success. If the best retrieved case is a perfect match, the system has achieved its goal and finishes. More usually, however, the retrieved case matches the problem case only to a certain degree. In this situation, the closest case may provide a sub-optimal solution, or the closest retrieved case may be revised using pre-defined adaptation formulae or rules. Adaptation gives CBR systems a rudimentary learning capability, which can improve (become more discriminatory) as the number of cases increases. There are, however, a number of limitations to CBR applications. When using past experiences to solve problems, it is difficult to determine whether the solutions to past problems have remained successful over time. Also, as the case base expands through the addition of new cases, many of the cases within it may become redundant. Case adaptation can also be a very complex process when attempting to derive modification rules. In this article only the retrieval aspect of CBR systems is used; it is anticipated that formulae and domain knowledge rules may be used for adaptation of cases in future work.
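The four-RE cycle can be made concrete with a short sketch. This is a minimal illustration of the Aamodt and Plaza cycle [38], not the architecture of the system described here (which exercises only the retrieval step); the case representation and method names are assumptions.

```python
# A minimal sketch of the CBR cycle [38]; the case representation is illustrative.
class CBRSystem:
    def __init__(self, case_base):
        # each case is a (problem_features, solution) pair
        self.case_base = case_base

    def retrieve(self, problem, similarity):
        """RETRIEVE: find the stored case most similar to the new problem."""
        return max(self.case_base, key=lambda case: similarity(problem, case[0]))

    def reuse(self, case):
        """REUSE: propose the retrieved case's solution for the new problem."""
        return case[1]

    def revise(self, solution, adapt=None):
        """REVISE: adapt the proposed solution if it is not a perfect match."""
        return adapt(solution) if adapt else solution

    def retain(self, problem, solution):
        """RETAIN: store the confirmed solution as a new case."""
        self.case_base.append((problem, solution))
```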
THE REQUIREMENTS
The requirements of the system were determined from the following sources:

1. A thorough review of the literature on strategic purchasing and, in particular, the make or buy decision.
2. On completion of the literature survey, interviews with ten procurement managers to determine current make or buy practice.

From these sources, the primary requirement of the system was to provide a company with a formal method for the analysis of the make or buy decision. This high-level requirement can be refined into the following sub-objectives:

• The system will address the vital issues that should be considered when analysing the technical capability and organisational profiles of potential suppliers. For example, what criteria should be included when analysing the delivery performance of suppliers?
• The system will allow the company to compare the technical capability of its internal operations with the best potential suppliers in the industry. This will permit the company to identify any advantages or disadvantages it may have over these suppliers.
• The system will allow the user group to carry out a comprehensive appraisal of the factors that must be considered when forging a partnership relationship with a supplier. These factors concentrate on the long-term ramifications of a potential relationship with a supplier.
• The system will contain a data structure to store the information necessary to make an effective make or buy decision. For example, the purchasing function can maintain and update records of suppliers' technical capability. This information could also be used in a Vendor Assessment and Selection system.
• The system will provide a framework to analyse the costs associated with the adoption of either a make or a buy strategy.
• The interaction style of the system will be designed for use by a management team rather than one individual. The dialogue between the managerial team and the system will take the form of questions, with various menus from which options can be chosen.
• The system will have a "what-if" analysis function to examine the impact of a change in the data inputs on the results. For example, when analysing the delivery performance of potential suppliers, the user group may wish to alter the weighting assigned to the Percentage On-time Delivery criterion to observe the effect on which suppliers are retrieved. This function recognises the dynamic nature of the issues addressed in the model: certain factors will change over different planning periods, and the user can build different scenarios to allow for this.
SYSTEM DEVELOPMENT
A prototyping approach was adopted for the development process. This allowed early evaluation of the prototype to be carried out with two of the companies initially interviewed. Issues such as interface design and proposed changes and enhancements for the system were also addressed at this stage, and modifications were made to the structure of the make or buy decision model as a result. As identified in the requirements stage, the system had to be PC based and use an industry-standard database. Visual Basic was chosen as the main development environment, since it allows rapid development of code and lets external specialised libraries be linked in for use by the main program. In practice, these external libraries comprise a CBR library, ReMind, and some graphics code libraries. The system uses an MS Access database as a back-end data store. ReMind was felt to be the best tool for the case-based retrieval function, as it uses the nearest neighbour algorithm, which proved most suitable for retrieving cases where a large number of features (fields) have a numerical data type.

It is assumed that the companies using the system will have a vendor assessment system with a back-end data store containing the following information:

• Records of the performance of suppliers on contracts carried out previously, in relation to issues such as quality, delivery, and customer service.
• Records of the Best-in-Class performance benchmarks in their industry.
• Cost performance breakdowns on suppliers.
• Detailed information on suppliers, such as financial performance and the culture of the organisation.

During the interaction process, the system retrieves from the vendor assessment database the suppliers that most closely meet the ideal characteristics required for the current contract. The information requirements of the make or buy model are illustrated in Figure 1 as database profiles.

SYSTEM DESCRIPTION
The system is structured around the make or buy model shown in Figure 1 and maps closely to the stages outlined earlier.

Stage 1-Performance Criteria Identification and Weighting
The user group must identify and weight the performance categories that are required to supply the component. This involves carrying out the following:

1. Technical Capability Categories: Select the categories that denote the technical competence of the potential supplier to supply the item.
Figure 2. Performance criteria identification and weighting.
2. Suppliers' Organisation Categories: Select the performance categories that will provide a sound indication of the compatibility of the supplier organisation with the purchasing organisation.

Each of these categories is then given a weighting that represents its importance to the analysis of the suppliers' organisations. The user group has the option of selecting the order in which each category is analysed; for example, under Technical Capability, the user group may wish to analyse the Delivery category before the Quality category. A number of the Technical Capability and Suppliers' Organisation categories are composed of quantitative criteria, while others comprise qualitative criteria. For example, the Quality category has quantitative criteria such as the quality costs/sales ratio, while the Organisation Culture category has qualitative criteria such as the level of trust. This is shown in Figure 2. Figures 3 and 4 show the decomposition of the categories and criteria within the Technical Capability and Supplier Organisation categories.

Stage 2-Technical Capability Analysis
The objective of this stage is to identify, in rank order, those suppliers that are technically competent in their ability to supply the item. It involves analysing three categories of criteria. The user group must analyse each category in turn to determine the scores of each potential supplier. An example of how the system retrieves the best suppliers in the Quality category using the nearest neighbour retrieval function is shown below.
Figure 3. Technical capability criteria.

Figure 4. Organisational profile criteria.
Figure 5. Technical capability analysis.
Quality category example
(i) Weight the importance of each criterion in each category to the purchasing decision. For example, in the Quality category, the Quality Costs/Sales Ratio may be considerably more important than all the other criteria in the category combined.
(ii) Enter the 'ideal' values for each criterion in each category. These 'ideal' values represent the most technically competent performance rating required from a supplier or competitor on each criterion. The purchasing company may have an objective for each criterion, or the 'ideal' may be the best possible value for the criterion. For example, if a supplier has carried out a contract for the company with zero defects, then this will be the best possible value for that criterion. The Quality category dialogue box in Figure 5 illustrates this.
(iii) The system then retrieves the potential suppliers that most closely meet the ideal criterion values set out by the user group, using the nearest neighbour retrieval function (refer to equation 1). Nearest neighbour retrieval works by comparing a collection of weighted features in the problem case to the same features in the stored cases.
Figure 6. Technical capability stage.
Depending on the weight given to each feature, an aggregate match score is calculated [39]:

$$\frac{\sum_{i=1}^{n} w_i \times sim(f_i^I, f_i^R)}{\sum_{i=1}^{n} w_i} \qquad (1)$$
where $w_i$ is the weight of feature $i$, $sim$ is the similarity function, and $f_i^I$ and $f_i^R$ are the values of feature $i$ in the input and retrieved cases, respectively. The retrieved case with the highest aggregate match score is the nearest match, and cases with lower scores are ranked beneath it. The retrieved set comprises both the external suppliers and the purchasing company itself: the 'ideal' profile for each category is compared against the internal performance of the purchasing company as well as against potential suppliers, so both the internal and external dimensions are considered. This is illustrated in Figure 6.
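A minimal sketch of the weighted nearest-neighbour scoring of equation (1) follows. It assumes purely numeric criteria and uses a simple normalised-distance similarity; the function and field names are illustrative and do not reflect the ReMind library's actual interface.

```python
# A sketch of equation (1); the similarity function and field names are
# illustrative assumptions, not the ReMind library's actual interface.
def aggregate_match(ideal, case, weights):
    def sim(a, b):
        # 1.0 for identical values, decreasing towards 0.0 as they diverge
        span = max(abs(a), abs(b), 1e-9)
        return 1.0 - abs(a - b) / span

    total = sum(w * sim(ideal[f], case[f]) for f, w in weights.items())
    return total / sum(weights.values())

# Hypothetical Quality-category profiles (criteria from the case structure)
ideal = {"quality_costs_sales": 0.0, "scrap_volume": 0.0, "warranty_claims": 0}
weights = {"quality_costs_sales": 0.4, "scrap_volume": 0.2, "warranty_claims": 0.4}
cases = {
    "Supplier A": {"quality_costs_sales": 2.0, "scrap_volume": 0.0, "warranty_claims": 0},
    "Supplier B": {"quality_costs_sales": 5.0, "scrap_volume": 3.0, "warranty_claims": 4},
}
# Rank retrieved cases: highest aggregate match score first
ranked = sorted(cases, key=lambda s: aggregate_match(ideal, cases[s], weights), reverse=True)
print(ranked)
```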
Table 1. Calculating the total technical capability score for a hypothetical supplier

| Category | Performance score | Category weight | Weighted performance score |
|---|---|---|---|
| Quality | 0.75 | 0.2 | 0.15 |
| Delivery | 0.62 | 0.4 | 0.25 |
| Customer Service | 0.91 | 0.4 | 0.36 |
| Total Score | | | 0.76 |
Once these tasks have been carried out for each category within the technical capability analysis, each potential supplier has a score for each category. These scores are then multiplied by the weights chosen in Stage 1 to obtain a total weighted score for the technical capability analysis. An example of this calculation for a hypothetical supplier is given in Table 1.
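The aggregation in Table 1 is a straightforward weighted sum; a minimal sketch using the table's values:

```python
# Category scores and Stage 1 weights, taken from Table 1
scores  = {"Quality": 0.75, "Delivery": 0.62, "Customer Service": 0.91}
weights = {"Quality": 0.2,  "Delivery": 0.4,  "Customer Service": 0.4}

total = sum(scores[c] * weights[c] for c in scores)
print(round(total, 2))  # 0.76, matching Table 1
```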
Case Structure

Each case structure is composed of a number of fields representing the criteria in each category. The Quality case structure consists of the following fields:

• Company Number
• Company Name
• Contract Name
• Date
• Quality Costs/Sales (%)
• Scrap/Volume (%)
• Waste/Volume (%)
• Number of Warranty Claims
• Downtime on Equipment (hrs.)
• Returns/Volume (%)

Cases within the case library consist of the relevant performance values of the suppliers retrieved from the vendor assessment system. For example, if five suppliers in the database fulfil the quality criteria, the system exports the relevant performance values of each of these suppliers into the case library, with one case representing the details of a contract previously carried out by a supplier. The structure and number of fields in each case may be customised to suit the requirements of the organisation in which the system is being implemented. When the quality requirements of the ideal supplier are input, the system attempts to find similar cases (suppliers) in memory, where similarity is determined by how closely the values of the criteria of the new ('ideal supplier profile') case and a stored case match.

Stage 3-Comparison of Retrieved Internal and External Technical Capability Profiles
This stage involves a comparison of the internal and external capabilities with the "Best-in-Class" on the range of criteria identified. The performance of each potential supplier retrieved should also be compared with the "Best-in-Class".
Figure 7. Comparison of internal and external sources. The figure presents the category scores and total weighted score for each source, together with the system's proposed advice:

| Source | Quality | Delivery | Customer Service | Weighted score |
|---|---|---|---|---|
| Supplier A | 0.15 | 0.25 | 0.36 | 0.76 |
| Supplier B | 0.17 | 0.28 | 0.36 | 0.89 |
| Supplier C | 0.13 | 0.23 | 0.30 | 0.66 |
| Supplier D | 0.17 | 0.38 | 0.36 | 0.91 |
| Supplier E | 0.15 | 0.35 | 0.33 | 0.83 |
| Internal Source A | 0.18 | 0.36 | 0.36 | 0.90 |
| Internal Source B | 0.15 | 0.32 | 0.33 | 0.80 |

Proposed advice (acceptance threshold = 0.8): discard Suppliers A and C; proceed to an analysis of the organisation profiles of Suppliers B, D and E.
Once all the potential suppliers have been analysed, the system filters out any suppliers that are unsuitable, on the basis of the total score each supplier attains in the technical capability analysis. For example, if the acceptance threshold set by the user is 0.8, any supplier whose total score exceeds this threshold is considered suitable. The sourcing company's technical capability performance is compared against the best potential suppliers. If no supplier matches the capability of the purchasing company, the system advocates a "Make" decision. However, if there are technically competent suppliers (which may include the purchasing company), the system advocates further analysis of these suppliers' organisations. An example of this type of decision is shown in Figure 7. As a further feature, a "what-if" analysis function was incorporated at this stage for the user group. "What-if" analysis is a type of sensitivity analysis because it is structured as "What will happen to the solution if an input variable, an assumption, or a parameter value is changed?" [34]. Here the system allows the user group to examine the impact of changes to data input earlier in the consultation process on the results.
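A minimal sketch of this filtering step, combined with a simple "what-if" re-run under a changed threshold (names and values are illustrative, taken from Figure 7):

```python
# Filter suppliers by total technical capability score (illustrative values)
def filter_suppliers(totals, threshold=0.8):
    keep = [s for s, t in totals.items() if t > threshold]
    discard = [s for s, t in totals.items() if t <= threshold]
    return keep, discard

totals = {"Supplier A": 0.76, "Supplier B": 0.89, "Supplier C": 0.66,
          "Supplier D": 0.91, "Supplier E": 0.83}

keep, discard = filter_suppliers(totals)                  # baseline consultation
what_if_keep, _ = filter_suppliers(totals, threshold=0.7) # "what-if": relaxed threshold
print(keep)         # ['Supplier B', 'Supplier D', 'Supplier E']
print(what_if_keep) # Supplier A now also passes
```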
Stage 4-An Analysis of the Suppliers' Organisations
The purpose of this stage is to assess the organisation profiles of the suppliers identified in the previous stage as technically proficient. This involves analysing the relevant characteristics used in establishing a close collaborative relationship with a supplier, through an in-depth analysis of the four categories indicated previously: organisation culture, technology, achievement of sales objectives, and financial objectives. As Ellram and Edis [29] indicate, these are important factors if the buyer is considering developing a close relationship with the supplier for a strategic purchased item. The suppliers' organisation profiles include soft factors that are difficult to quantify and that concentrate not only on immediate concerns but also on the long-term ramifications of a potential relationship with a given supplier, such as financial stability, strategic fit and top management compatibility. The purpose here is to demonstrate that such factors, usually less quantifiable in nature, are as important when a firm is seeking a supplier partnership as those typically included in current supplier selection models. Issues such as strategic direction and management compatibility are very important when a company is faced with the selection of a supplier with which it wishes to establish a strategic partnership.

Applying multi-attribute analysis
Managerial decision making inevitably involves the consideration of multiple objectives. For some problems, like short-term production scheduling, the dominance of certain objectives such as cost reduction justifies the use of single-objective models. For longer-term planning problems, however, single-objective models are inadequate because of the complexity and subjective nature of the problem. The need to identify and consider several objectives simultaneously in the analysis and solution of such problems has led to a relatively new field of study: multiple criteria decision making (MCDM) [40]. Over the last two decades, there has been a steady growth in the number of MCDM methods [41, 42]. MCDM models can be categorised into two groups: multiple objective decision making (MODM) and multiple attribute decision making (MADM). MODM methods are sometimes viewed as natural extensions of mathematical programming, where several objective functions are considered simultaneously and the decision variables are bounded by mathematical constraints. MADM methods, on the other hand, involve choosing from a finite number of feasible alternatives that are characterised by multiple but fixed attributes. The most widely used MADM methods are multiple attribute utility theory (MAUT), the outranking methods and the analytic hierarchy process (AHP). Multi-attribute analysis (MAA) is capable of selecting and identifying the optimum choice with respect to a set of objectives where the decision alternatives are predetermined. Its primary advantage is that it facilitates decision making despite the presence of multiple conflicting criteria [43]. It is a quantitative approach that considers multiple attributes in respect of multiple client objectives, with preferences incorporated through the assignment of importance weights. MAA reflects real decision situations by encompassing client judgements.
Table 2(a). Technology profiles of supplier alternatives in respect of qualitative factors

| Supplier alternatives | Manufacturing capabilities | Technical support | Design capability | Investment in R&D | Speed of development | New product introduction rate |
|---|---|---|---|---|---|---|
| Supplier B | Acceptable | Excellent | Excellent | High | Excellent | Excellent |
| Supplier D | Excellent | Poor | Poor | High | Poor | Excellent |
| Supplier E | Acceptable | Poor | Excellent | Low | Poor | Poor |

Table 2(b). Technology profiles of supplier alternatives given quantitative values

| Supplier alternatives | Manufacturing capabilities | Technical support | Design capability | Investment in R&D | Speed of development | New product introduction rate | Sum | Percentage score |
|---|---|---|---|---|---|---|---|---|
| Supplier B | 0.5 | 1.0 | 1.0 | 0.8 | 1.0 | 1.0 | 5.3 | 88.3 |
| Supplier D | 1.0 | 0.2 | 0.2 | 0.8 | 0.2 | 1.0 | 3.4 | 56.6 |
| Supplier E | 0.5 | 0.2 | 1.0 | 0.2 | 0.2 | 0.2 | 2.3 | 38.3 |
Options may be assessed systematically to produce aggregated results, with the highest score indicating the optimum choice. MAA is therefore suitable for the multi-criteria nature of the make or buy decision. An example of how the system evaluates the best suppliers on Technology using multi-attribute analysis is given below.

Technology Example
Assume that, from an analysis of the technical capability of a number of potential suppliers, three are considered sufficiently competent to produce the item. It is now necessary to evaluate these suppliers against six criteria in the Technology category identified from the literature review: manufacturing capabilities, technical support, design capability, investment in R&D, speed of development and new product introduction (NPI) rate [44, 45, 46]. The performance values for each of these criteria, for each supplier, are stored in the vendor assessment system and can be updated according to the performance of each supplier over time. Table 2(a) shows the qualitative assessments of these factors for each supplier. The factors are then given quantitative values to facilitate evaluation, as shown in Table 2(b); the scores for each factor are summed and expressed in percentage terms. To improve upon this, importance weights are introduced to emphasise the importance of each factor in the make or buy decision. Importance weights are determined in relation to the nature of the contract between the buyer and the supplier. For example, if the buyer wishes to establish a long-term co-operative relationship with a supplier, then design capability is crucial to its success. As an example, weights are assigned as shown in Table 2(c) and the weighted scores calculated. It can be seen that suppliers B and E have maintained their scores.
'"
o
-
W
0.15 0.15 0. 15
Supplier alter na tives
Sup plier B Supplier D Supplier E
W
0.25 0 .25 0.25
0.5 1.0 0.5
V
1.0 0.2 0.2
Tech nical suppo rt
V
Manufa cturing capabilities
0.20 0.20 0.20
W V
1.0 0.2 1.0
Design capability
0 .20 0.20 0. 20
W 0 .8 0 .8 0 .2
V
Investm ent ill R &D
Q ualitative factors
Table 2(c) 'Tech nology profi les o f supplier alternatives with qu an titative and weighted values
V
1.0 0.2 0.2
W 0.10 0.10 0. 10
Spe ed o f develop m ent
0 .10 0 .10 0 .10
W
1.0 1.0 0 .2
V
N ew Prod uct int rodu cti on rate
0 .88 0 .52 0 .36
Weight cd sco re
Figure 8. Suppliers' organisation analysis.
However, supplier D's score decreases because of the low score obtained for design capability and the relatively high weighting given to this factor. The need for MAA can be appreciated given the conflicting evaluations across the technology criteria identified. Once these tasks are carried out for each category within the organisation profile analysis, a total organisation profile score is computed for each supplier, using the same calculation method as for the technical capability total score. The system will filter out any suppliers that are unsuitable on the basis of the total score each supplier attains under the organisation analysis. If no supplier has a suitable organisation profile with which to initiate a partnership, then a "Make" strategy is advocated by the system. An example of this is shown in Figure 8. However, if a number of suppliers have been identified as suitable, the system recommends further analysis of the Total Acquisition Cost involved with these suppliers as well as with the purchasing company.
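The qualitative-to-quantitative mapping and weighted aggregation of the Technology example can be sketched as follows; the rating scale is inferred from Tables 2(a)-2(c), and the dictionary keys are illustrative.

```python
# Ratings-to-values mapping inferred from Tables 2(a) and 2(b)
RATING = {"Excellent": 1.0, "Acceptable": 0.5, "Poor": 0.2, "High": 0.8, "Low": 0.2}

# Importance weights from Table 2(c); the factor keys are illustrative
WEIGHTS = {"manufacturing": 0.15, "support": 0.25, "design": 0.20,
           "r_and_d": 0.20, "speed": 0.10, "npi": 0.10}

def weighted_score(profile):
    """Multi-attribute score: sum of weight x quantified rating."""
    return sum(WEIGHTS[f] * RATING[profile[f]] for f in WEIGHTS)

supplier_d = {"manufacturing": "Excellent", "support": "Poor", "design": "Poor",
              "r_and_d": "High", "speed": "Poor", "npi": "Excellent"}
print(round(weighted_score(supplier_d), 2))  # 0.52, as in Table 2(c)
```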
EVALUATION

The system prototype has been refined and tested over a period of six months in a multi-national telecommunications company (for the purposes of confidentiality the organisation will be referred to as the Company). Preliminary work focused on customising the generic model of the make or buy process to meet the specific needs of the organisation. The system is currently proficient at evaluating suppliers' capabilities based on technical and organisational profiles (Stages 1 to 4 of Figure 1)
and the next step is to include a mechanism for integrating the total acquisition cost into the decision-making process (Stage 5). Even at such an early phase in the project, the Company has identified a number of benefits from the implementation of the system:

1. It provides potential suppliers with a clear understanding of the priorities of the organisation with regard to key performance criteria. For example, as can be seen from Figure 2, under the technical category, Quality is perceived by the Company to be of higher importance than Delivery and Customer Service. Suppliers can therefore organise and manage their business operations to match the criteria desired by the Company.
2. It provides clarification to those potential suppliers who were unsuccessful in being awarded a contract and assists them in enhancing their competitive position. For example, Figure 7 is a comparative evaluation of each of the potential sources of supply. Supplier C achieved the lowest scores across all three categories and was unsuccessful in getting the contract. Supplier C could investigate the reasons for this poor performance by breaking down each category into its constituent elements. This would provide a more detailed analysis of its areas of weakness in relation to the "Best-in-Class", as illustrated in Figure 6 for the Quality criteria.
3. Internal sources of supply within the Company can be provided with a detailed analysis of their strengths and weaknesses in relation to other suppliers. In effect, the system provides a method of benchmarking the internal suppliers' technical criteria against those of external suppliers. Consequently, these sister companies can identify potential areas for improvement and ultimately raise their level of competence, improving the overall competitive position of the Company.
4. A cross-functional task force from the Company was involved in initially defining and selecting the model attributes, as well as establishing "Best-in-Class" technical and organisational profiles for suppliers through a benchmarking exercise. The close interaction between staff has enhanced their understanding of the various functional areas involved in the make or buy decision and, at the same time, has improved the cohesiveness of the procurement team.
5. Within the telecommunications industry, product development times are measured in months, and companies continuously investigate ways of compressing time to market in order to enhance their speed of response to customers. The system assists in reducing the product development timeframe since it automates the supplier selection process and provides the Company's buyers with a flexible and responsive tool for evaluating prospective suppliers. Before the introduction of the new system, buyers spent several days in discussions with design, manufacturing, finance, marketing and accounting professionals in order to determine the most suitable vendor. Since this knowledge is now contained within the system, the time involved in conducting the evaluation process has been considerably reduced.

In terms of disadvantages, the Company identified a number of important issues which it felt were key factors in the success of the project, but which required
considerable effort on the part of the organisation:

1. A significant proportion of time was spent by Company personnel identifying and measuring best-in-class suppliers for each of the attributes in the model. For large companies like the Company, this task is made easier by their global presence and by the fact that historical data on existing suppliers already existed. The process of data collection was also facilitated by the recent establishment of a benchmarking team at corporate level, which had been identifying the key performance metrics expected of suppliers within the telecommunications industry.
2. The various attributes within the model are weighted according to their importance in the purchasing decision. The weightings for each factor were determined by members of the multi-functional task force at a series of meetings where the importance of each variable was discussed and evaluated. Considerable time was spent by the team in achieving consensus, particularly for the qualitative factors, which are more judgemental in nature. It also became apparent that the importance given to each attribute may change over time, and hence the cross-functional team would need to meet on a regular basis to discuss and assess the contribution of each criterion to the make or buy decision.

FURTHER ENHANCEMENTS
Dynamic performance analysis
An important enhancement would be the capability to compare two or more suppliers' performance measures over time. The purchasing company needs to discover the determinants of events and trends in supplier performance; from this analysis it can assess whether the performance of each supplier is improving or declining. The temporal aspect could simply be treated as another dimension, yet this approach may lose much of the semantic information encoded in trend lines. The authors intend to investigate whether techniques used frequently in econometrics, such as co-integration, may be employed in conjunction with nearest neighbour retrieval to replicate more closely the make or buy decision-making process.

A consultancy tool
It is anticipated that the end users of this system would be the personnel in a company responsible for the make or buy decision. Another class of user is the consultant. In this mode of operation, the system could be used to collect and analyse the relevant information for a make or buy analysis from the client company, automatically generating a report containing advice and stating the reasons for and against the conclusions. It is envisaged that usability issues will be addressed in a future version of the tool.

Application of AI techniques
The make or buy decision-making process is complicated by the fact that various criteria (quantitative and qualitative) must be considered. As indicated by Vokurka
et al. [33], the criteria used may vary across different product categories and situations; trade-offs may exist among the various criteria and may not be readily apparent; often the data is not available or its validity is suspect; and in many cases the relevant objectives are in conflict. Additionally, multiple participants are involved in the assessment process: for some products or phases of the assessment one functional area may have more influence, whereas at other times another functional area may be in the influential position. Hence, substantial judgement is required to assess the wide range of trade-offs present, to recognise all the alternatives available, and to make a decision that balances both the short- and long-term needs of an organisation. Consequently, with regard to future work, it is proposed that a hybrid approach be adopted in which the uncertainty and ambiguity of decision-making is modelled using fuzzy logic in conjunction with a rule-based intelligent system approach to assessing the performance of suppliers [47]. The main positive characteristic of fuzzy logic is that it can easily link qualitative and subjective 'fuzzy' variables with quantitative variables. The former can represent concepts using 'linguistic variables' (variables whose values are words or sentences). Linguistic variables can cope with the multi-dimensional or subjective character of concepts related to the estimation of supplier capabilities and environmental resources. Fuzzy sets have the potential to significantly improve the supplier assessment model by describing supplier characteristics in a more effective and 'human-like' manner; in many respects, fuzzy sets reflect the way in which experts think about a problem.
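To illustrate the idea of a linguistic variable, the sketch below defines triangular membership functions for a qualitative criterion such as the level of trust. The variable, scale and parameters are illustrative assumptions, not those of the fuzzy-hierarchical model cited [47].

```python
# Triangular membership functions for a linguistic variable; the terms,
# the 0-10 scale and the parameters below are illustrative assumptions.
def triangular(x, a, b, c):
    """Membership degree of x in the triangular fuzzy set (a, b, c)."""
    if x < a or x > c:
        return 0.0
    if x == b:
        return 1.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

# Linguistic terms for "level of trust" on a 0-10 scale
trust_terms = {
    "low":    lambda x: triangular(x, 0.0, 0.0, 5.0),
    "medium": lambda x: triangular(x, 2.0, 5.0, 8.0),
    "high":   lambda x: triangular(x, 5.0, 10.0, 10.0),
}

score = 6.5  # e.g. a panel's trust rating for one supplier
print({term: round(f(score), 2) for term, f in trust_terms.items()})
# {'low': 0.0, 'medium': 0.5, 'high': 0.3}
```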
CONCLUSION

The strategic purchasing model described in this article attempts to overcome some of the problems highlighted earlier in association with the outsourcing decision, and to act as a decision aid for a cross-functional team involved in the make or buy evaluation process. Further implementation of this system is being carried out in collaboration with a multi-national electronics company and an engineering manufacturing company. The development of this system has shown that it is possible to use a knowledge-based systems methodology to build a support system in an area of strategic purchasing, especially where the domain is well defined, has a large number of factors to be considered, and the relevant knowledge is available. The system uses case-based retrieval technology in order to take advantage of the reasoning power of this technique. Integrating different IT and KBS techniques into a hybrid system provided an environment suitable for rapid application development; CBR can be combined with, for example, MAA to construct an effective knowledge-based system. In the particular case of analysing the technical performance of potential suppliers, it was found that case-based reasoning adapts more naturally to the actual way in which the purchasing company carries out this process. Case-based reasoning also eases the task of knowledge acquisition in comparison with conventional rule-based methods: a case base in this context can be produced from the performance criteria required for the current purchasing situation, which may be obtained directly by interviewing the members of a cross-functional make or buy team.
REFERENCES
[1] Ammer, D. S. (1972). Is your purchasing department a good buy?, Harvard Business Review, March-April, 36-59.
[2] Farmer, D. (1978). Developing purchasing strategies, Journal of Purchasing and Materials Management, 14, Fall, 6-11.
[3] Porter, M. E. (1980). Competitive Strategy: Techniques for Analysing Industries and Competitors. New York: The Free Press.
[4] Spekman, R. E. (1981). A strategic approach to procurement planning, Journal of Purchasing and Materials Management, Winter, 3-9.
[5] Burt, D. N. and Soukup, W. R. (1985). Purchasing's role in new product development, Harvard Business Review, September-October, 90-96.
[6] Gadde, L. and Hakansson, H. (1994). The changing role of purchasing: reconsidering three strategic issues, European Journal of Purchasing and Supply Management, 1 (1), 27-35.
[7] Lamming, R. (1993). Beyond Partnership: Strategies for Innovation and Lean Supply, Prentice-Hall, Hemel Hempstead, UK.
[8] McIvor, R. T., Humphreys, P. K., and McAleer, W. E. (1997). The evolution of the purchasing function, Journal of Strategic Change, 5 (6), 169-179.
[9] Probert, D. R., Jones, S. W., and Gregory, M. J. (1993). The make or buy decision in the context of manufacturing strategy development, Journal of Engineering Manufacture, Proceedings of the Institution of Mechanical Engineers, 207, 241-250.
[10] Dobler, D. W., Burt, D. N., and Lee, L. (1990). Purchasing and Materials Management, McGraw-Hill, New York.
[11] Ford, D., Cotton, B., Farmer, D., Gross, A., and Wilkinson, I. (1993). Make-or-buy decisions and their implications, Industrial Marketing Management, 22, 207-214.
[12] Yoon, K. P. and Naadimuthu, G. (1994). A make-or-buy decision analysis involving imprecise data, International Journal of Operations and Production Management, 14 (2), 62-69.
[13] Dooley, K. (1995). Purchasing and supply-an opportunity for OR?, OR Insight, 8 (3), 21-25.
[14] Blaxill, M. F. and Hout, T. M. (1991). The fallacy of the overhead quick fix, Harvard Business Review, July-August, 93-101.
[15] Davis, E. W. (1992). Global outsourcing: have US managers thrown the baby out with the bath water?, Business Horizons, July-August, 58-65.
[16] American Management Association (1991). Accountants admit numbers don't add up, Industry Forum, April, 4.
[17] Prahalad, C. K. and Hamel, G. (1991). The core competence of the corporation, Harvard Business Review, July-August, 79-91.
[18] Hayes, R., Wheelwright, S., and Clark, K. (1988). Dynamic Manufacturing: Creating the Learning Organization, Free Press, New York.
[19] Platts, K. and Gregory, M. (1989). Competitive Manufacturing: A Practical Approach to the Development of a Manufacturing Strategy, IFS, Bedford.
[20] Jennings, D. (1997). Strategic guidelines for outsourcing decisions, The Journal of Strategic Change, 6, 85-96.
[21] Quinn, J. B. and Hilmer, F. G. (1994). Strategic outsourcing, Sloan Management Review, Summer, 43-55.
[22] Venkatesan, R. (1992). Strategic sourcing: to make or not to make, Harvard Business Review, November-December, 98-107.
[23] Welch, J. and Nayak, P. (1992). Strategic sourcing: a progressive approach to the make or buy decision, Academy of Management Executive, 6 (1), 23-30.
[24] Probert, D. (1996). The practical development of a make or buy strategy: the issue of process positioning, Integrated Manufacturing Systems, 7 (2), 44-51.
[25] Humphreys, P. K., McAleer, W. E., and McIvor, R. T. (1996). Strategic purchasing: the implications for Northern Ireland business, Irish Business and Administrative Review, 17.
[26] McIvor, R. T., Humphreys, P. K., and McAleer, W. E. (1997). A strategic model for the formulation of an effective make or buy decision, Management Decision, 35 (2), 169-178.
[27] Hines, P. (1996). Purchasing for lean production: the new strategic agenda, International Journal of Purchasing and Materials Management, Winter, 2-10.
[28] Smytka, D. L. and Clemens, M. W. (1993). Total cost supplier selection model: a case study, International Journal of Purchasing and Materials Management, Winter, 42-49.
[29] Ellram, L. M. and Edis, O. (1996). A case study of successful partnering implementation, International Journal of Purchasing and Materials Management, Fall, 20-28.
[30] DTI (1995). Efficiency and Value in Purchasing and Supply, London.
[31] Cook, R. (1992). Expert systems in purchasing: applications and development, International Journal of Purchasing and Materials Management, Fall, 20-27.
[32] Allen, M. and Helferich, O. (1990). Putting Expert Systems To Work In Logistics, Oak Brook, IL: Council of Logistics Management.
[33] Vokurka, R., Choobineh, J., and Lakshmi, V. (1996). A prototype expert system for the evaluation and selection of potential suppliers, International Journal of Operations and Production Management, 16 (12), 106-127.
[34] Turban, E. (1995). Decision Support Systems and Expert Systems (fourth ed.), Prentice-Hall, New Jersey.
[35] Kolodner, J. L. (1991). Improving human decision-making through case-based aiding, AI Magazine, 12 (2), 52-68.
[36] Schank, R. C. (1982). Dynamic Memory: The Theory of Reminding and Learning in Computers and People, Cambridge University Press, New York.
[37] Cook, R. L. (1997). Case-based reasoning systems in purchasing: applications and development, International Journal of Purchasing and Materials Management, Winter, 32-39.
[38] Aamodt, A. and Plaza, E. (1994). Case-based reasoning: foundational issues, methodological variations and system approaches, AI Communications, 7 (1), 39-59.
[39] Kolodner, J. (1993). Case-Based Reasoning, Morgan Kaufmann, California.
[40] Mustafa, A. and Goh, M. (1996). Multi-criterion models for higher education administration, OMEGA, 24 (24), 167-178.
[41] Stewart, T. J. (1992). A critical survey on the status of multiple criteria decision making theory and practice, OMEGA, 20, 569-586.
[42] Colson, G. and Bruyn, C. D. (1989). Models and methods in multiple objectives decision making, Mathematical Computer Modelling, 12, 1201-1211.
[43] Hwang, C. and Yoon, K. (1981). Multi-Attribute Decision Making: A State of the Art Survey, Springer, Berlin.
[44] Dowlatashahi, S. (1996). The role of logistics in concurrent engineering, International Journal of Production Economics, 14, 89-199.
[45] Roy, R. and Potter, S. (1996). Managing engineering design in complex supply chains, International Journal of Technology Management, 12 (4), 403-420.
[46] Gerwin, D. and Guild, P. (1994). Redefining the new product introduction process, International Journal of Technology Management, 9 (5/6/7), 678-690.
[47] Morlacchi, P. (1999). Vendor evaluation and selection: the design process and a fuzzy-hierarchical model, 8th International Annual IPSERA Conference, Belfast and Dublin, 611-620.
INTELLIGENT INTERNET INFORMATION SYSTEMS IN KNOWLEDGE ACQUISITION: TECHNIQUES AND APPLICATIONS
SHIAN-HUA LIN
1. INTRODUCTION
The explosive growth of the World Wide Web continues to revolutionize information editing, publishing and accessing patterns. Within the Web infrastructure, individuals can easily edit and publish documents that contain hyperlinks to other documents published by the same or other Web sites. As a result, the Web contains information on almost any subject, available anywhere, to anyone, at any time. However, this explosive growth has made finding information like looking for a needle in a haystack. Although directory services (like Yahoo!1) and search engines (like Google2) facilitate information searches, many users still have difficulty locating useful information. Browsing directories is time-consuming, as there is a seemingly infinite number of possible topics. For example, Open Directory (currently the largest directory database) contains over 460,000 categories3; users must click through level after level to find a target directory and browse its documents. Furthermore, the construction of directories is labor-intensive, and directory services cannot keep up with Web growth. Finding documents using search engines is frustrating, as search results usually contain thousands of links. Although some search engines like Google apply hyperlink analysis to provide better ranking, results are still often ineffective.
1 http://www.yahoo.com/.
2 http://www.google.com/.
3 http://dmoz.org/. The Web site contained over 3.8 million sites, 57,238 editors, and over 460,000 categories when I visited it on June 26, 2003.
Consequently, finding the right document on the Web is difficult when using directory services and search engines. Obtaining the desired information from a Web document is even more difficult: users usually want not only to find documents but also to find answers within those documents. For example, say a person wants to know which computer vendor sells a notebook with a particular chipset that fulfills his or her price requirement. Unfortunately, Yahoo! and Google cannot provide this information. The user must try to find a Web site that provides a price-comparison4 service, connect to the site, input his or her requirements into the search fields, and then possibly obtain useful results. However, price-comparison sites are usually database-oriented applications and are highly dependent on people to manually enter product information. In this paper, I propose an Intelligent Internet Information (I3) system to collect and extract structured information from Web documents. By obtaining knowledge from the pre-processed structured information, the I3 system aims to make possible the automatic construction of an Internet domain knowledge base.

2. RELATED WORK
The I3 system is an integration of several computer science research fields. The Internet provides the infrastructure, in that Web services are the fundamental methods used to locate information sources, access source information, and understand source presentation. Search engines (or information retrieval systems) process and index Web documents to efficiently access information sources. The widely used Web publication format, hypertext markup language (HTML) [86], was designed for presentation purposes. Semantically structured information, however, is not defined in HTML, so search engines and automatic programs are hard-pressed to extract structured information from popular HTML pages. Although extensible markup language (XML) [91] is designed to deliver the structured information of a Web page, it is not yet in popular use. Research on information extraction has therefore developed techniques to automatically extract structured information from unstructured or semi-structured Web pages. Machine learning and data mining are then applied to obtain knowledge from the extracted information and store that knowledge in databases or knowledge bases. In this section, I introduce the major studies that form the basis of I3.

2.1. The Web
The growth of the Web has stimulated numerous information sources published as HTML pages on the Internet. Millions of new documents and thousands of new Web sites become available each day. From this sea of information, retrieving relevant documents is at best challenging; a greater challenge is to extract useful information and knowledge from those documents. In this section, I describe the Web environment and summarize several problems with Web documents that affect the I3 system design.
4 http://directory.google.com/Top/Home/Consumer_Information/Price_Comparisons/?tc=1 is the price-comparison directory organized in the Open Directory. http://www.dealtime.com/ is one example of a shopping Web site.
The Web environment

With the birth of the World Wide Web [9], HTTP [87] and HTML have become the most widely used network application protocol and document format, respectively. As of June 20025, Google and FAST's AllTheWeb6 could each search about 2.1 billion documents. The current search engine size of Google7 is over 3 billion documents, with almost 1 billion new documents added in one year, and the current size of the Web is several times larger than Google's reach. Although accessing useful information from several billion documents is a daunting task, the Web provides an enormous treasure trove of information and knowledge. In order to exploit this potential, the following problems should be understood and solved.
• The size of document sets is extremely large. The exponential growth of the Internet creates two difficult issues: scalability and performance. Both factors influence database size and retrieval performance when designing search engines.
• Web documents may not be well structured. HTML pages and ASCII text pages are regarded as semi-structured and unstructured documents, respectively. Since current Web pages are not designed to be machine-readable, it is difficult for Web applications to identify the informative content of a page without knowing its structured information. For example, many dot-com pages contain advertisements whose content might be parsed, indexed and retrieved by search engines and information extraction systems; both kinds of systems may thus process content that is really noise.
• Some Web documents are redundant. Approximately 30% of Web pages are duplicated or similar, owing to mirror sites and the default pages of installed Web servers [14]. This situation is referred to as inter-page redundancy in our previous study [57]. Semantic redundancy is more problematic: a news site might publish the same news article on several pages, each appearing in a different news category. Redundancy within pages is referred to as intra-page redundancy [57]. Search engines usually calculate and store the message digest of a page to determine inter-page redundancy (a minimal sketch of this follows the list); however, this approach cannot detect intra-page redundancy.
• The quality of Web documents is not guaranteed. As the Web is a distributed environment, there is no standardized publishing process for Web documents. Web documents are often published with an invalid format8, bad links, or incomplete contents (such as unavailable multimedia objects). Web crawlers and document parsers encounter difficulty processing these poor-quality documents. Moreover, some documents, known as Web hoaxes, are published for humor, or to mislead or confuse users. Search engines and Web mining systems therefore need intelligent pre-processors to dismiss such documents.

5 http://www.searchenginewatch.com/searchday/article.php/2160141.
6 AllTheWeb: http://www.alltheweb.com/.
7 As of June 27, 2003, http://www.google.com/ indicated that Google's current size was 3,083,324,652 documents.
8 Microsoft Internet Explorer (MSIE) is capable of interpreting and presenting some invalid HTML. Currently, most Web pages are presented for MSIE; therefore, based on the MS COM architecture, programming a document parser embedded with the IE HTML parser components is a good approach.
• There are different languages within documents. The Internet connects more than 200 countries with various language backgrounds. Most languages use a Roman-alphabet-like system that is small in size, while other writing systems, such as Chinese and Japanese, are very large. Most search engines focus only on indexing pages written in the local language. Generally, the linguistic information of a page is specified in the CHARSET field of the META tag, allowing some crawlers to access pages written in the indicated languages. Unfortunately, many Web pages are written without linguistic information, and some pages even provide incorrect CHARSET information and are not actually written in the language specified. Consequently, a linguistic detector is needed to identify the language in order to evaluate content semantics.

These problems prompted studies of Web mining, information extraction, intelligent agents, and document and text analysis, among others; these studies focus on developing software programs that facilitate accessing, extracting, and learning from the Web. From a different perspective, Tim Berners-Lee developed a method, the Semantic Web, to cope with the problem.
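As promised above, here is a minimal sketch of message-digest duplicate detection. It illustrates the general idea rather than any particular search engine's implementation; a whole-page digest catches only exact (inter-page) duplicates.

```python
import hashlib

def page_digest(html: str) -> str:
    """Message digest of a page; search engines store such digests to
    detect inter-page (exact-duplicate) redundancy."""
    return hashlib.md5(html.encode("utf-8")).hexdigest()

seen_digests: set[str] = set()

def is_inter_page_duplicate(html: str) -> bool:
    digest = page_digest(html)
    if digest in seen_digests:
        return True
    seen_digests.add(digest)
    return False

# A single changed byte defeats the digest, so near-duplicates and
# intra-page redundancy require other techniques (e.g. shingling).
assert not is_inter_page_duplicate("<html>mirror copy</html>")
assert is_inter_page_duplicate("<html>mirror copy</html>")
```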
The Semantic Web

The Semantic Web [88] sketches the Web as a framework based on XML [91], the Resource Description Framework (RDF) [89], and Web Ontology [90]. The Semantic Web represents data and knowledge on the World Wide Web [88]. It is based on RDF, which integrates a variety of applications such as library catalogs and world-wide directories. XML provides the interchange syntax to syndicate and aggregate news, software, and content collections of music, photos, and events. The RDF specifications provide a lightweight ontology system to support the exchange of knowledge on the Web. Rather than tackling the artificial intelligence problem of training machines to think like people, the Semantic Web approach develops languages for expressing information in a machine-processable form [10]. More details of the Semantic Web are available from the World Wide Web Consortium (W3C) [84]. The Semantic Web provides a road map to guide development of the I3 system; however, the semantic content framework is not yet mature. Although XML support and tools have been developed, the tools and support for RDF are immature at the present time. From the perspective of publishing Web content, the Semantic Web provides standards and tools to add machine-readable information (such as metadata) to the Web. However, the integration of Semantic Web technologies into the current Web is just beginning. HTML documents still dominate, and thus a gap exists between the conventional Web and the Semantic Web. Web mining, information extraction, and other intelligent techniques are urgently needed to close this gap.

2.2. Information retrieval
Due to the explosive growth of Web documents, Information Retrieval (IR) systems need to be refined to deal with the huge number of documents. Previous studies on IR systems focused on improving retrieval efficiency by using term-based indexing and
query reformulation techniques. Term-based document processing initially extracts terms from documents based on pre-constructed dictionaries (or thesauri), stop words, and stemming rules. Once terms are extracted, the widely used TF × IDF method (or one of its variations) is used to calculate term weights. A document is therefore represented by a set of terms and term weights. The similarity measure between a query and a document is the inner product of their corresponding term vectors, i.e., the cosine value between the two normalized vectors in a multi-dimensional vector space. To indicate the degree of relevance between documents and queries, retrieved documents are presented as a ranked list based on the similarity measure [32][76][77][78][80][95].

Alternatively, the string-based approach indexes strings and all possible sub-strings instead of terms as in the term-based approach. This is particularly useful for arbitrary-length string searches, such as string matching and character-based language search (for languages such as Chinese and Japanese). Notably, the storage requirement of the string-based indexing approach is much higher than that of term-based indexing. In addition, the complicated data structures of string-based indexing require more retrieval time. While superior in retrieving matched strings, the string-based approach is inappropriate for Internet information discovery queries in which users only provide conceptual descriptions instead of exact strings. Many researchers have developed string-based indexing technologies, including the PAT-tree [22] and signature files [26].

Usually, Web IR systems consist of search engines and directory services. Search engines employ various IR techniques to retrieve information efficiently. Directory services organize the Web into a hierarchical conceptual tree or lattice, which makes a wide range of topics reachable through mouse clicks. Web IR systems make traditional IR systems compatible with the Web in the following ways.

Crawling and indexing
Search engines visit Web pages based on user submissions or by means of automatic Web crawlers (also called spiders or robots). A document parser is then applied to extract texts or terms from Web pages. Like conventional IR systems, search engines index a set of words or phrases for efficient retrieval. Based on the rich HTML format, search engines enhance their indexing scheme by weighting indexed terms according to HTML tags.

Representation
Most search engines employ full-text indexing to quickly match queries with the list of terms that represent documents. Terms are usually weighted by TF × IDF [77], as is the case with conventional IR systems. The list of term-weight pairs forms a vector that represents a document in the Vector Space Model [79]. Most topic directory systems and portal sites manually organize Web pages into a topic hierarchy. That is, a portion of Web documents are represented by a hierarchical conceptual tree, an intrinsic knowledge base for the I3 system.
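To make the vector space representation concrete, the following Python sketch builds TF × IDF vectors for a toy corpus and ranks documents against a query by cosine similarity. The three documents, the whitespace tokenization, and the smoothed IDF formula are illustrative assumptions; a real engine would add the stop-word removal and stemming discussed elsewhere in this chapter.

import math
from collections import Counter

docs = [
    "web mining extracts knowledge from web documents",
    "search engines index web pages for efficient retrieval",
    "data mining discovers patterns in large databases",
]
tokenized = [d.split() for d in docs]
n = len(docs)
df = Counter(t for toks in tokenized for t in set(toks))  # document frequency

def weight(tokens):
    # TF x IDF; the +1 terms smooth the IDF for unseen query terms.
    tf = Counter(tokens)
    return {t: tf[t] * math.log((n + 1) / (df[t] + 1)) for t in tf}

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

doc_vecs = [weight(toks) for toks in tokenized]
query = weight("web mining".split())
ranked = sorted(range(n), key=lambda i: -cosine(query, doc_vecs[i]))
print([(i, round(cosine(query, doc_vecs[i]), 3)) for i in ranked])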
Querying

In the Web environment, search engines employ several functions to retrieve and refine search results. Most search engines use Boolean operators to retrieve precise
results [59]. Other functions, such as phrase matching, restricting search by URL patterns, and sorting or grouping results by corresponding sites, are also useful for refining search results. Relevance feedback is applied to refine the search result based on the user's feedback [4]. As for ranking, the Hyperlink-Induced Topic Search (HITS) algorithm [48] and Google's PageRank [13] are popular methods for ranking search results. The ranking policy is related to Web hyperlink analysis and is illustrated in section 2.3.

Implementation
Search engines and topic directory systems need to cope with the dynamic Internet environment. In contrast with the stable context of conventional IR systems, Web pages are frequently created, modified and deleted, requiring Web IR systems to be equipped with dynamic storage structures and efficient indexing mechanisms. The implementation of intelligent Web crawlers is a new challenge for collecting related Web pages on demand. There are currently hundreds of search engines that use IR techniques to retrieve Web documents. Popular search engines are famous for their ranking policies, rich indexes and fast response times. In general, most search engines borrow indexing and ranking methods from IR and improve their performance by adding advanced hardware and sophisticated software. User satisfaction suffers more when search engines return too many documents than when no documents are returned. To learn more about the current status of popular search engines, readers can access Search Engine Watch (http://searchenginewatch.com/).

2.3. Hyperlink analysis
The hyperlink environment is a distinguishing difference between the Web and the conventional IR environment. Hyperlinks provide the most significant page quality information. The Hyperlink-Induced Topic Search (HITS) algorithm [50] and Google's PageRank [13] analyze the hyperlinked structure around a page to estimate its quality. HITS estimates authority and hub values of hyperlinked pages, while Google's PageRank [13], the most popular ranking scheme, merely ranks pages according to a popularity measure. Both are effective methods of ranking search results. HITS, based on a mutual reinforcement relationship, provides an innovative methodology for re-ranking Web search results for topic distillation. According to the definition in [48], a Web page is authoritative on a topic if it provides good quality information, and is a hub if it provides links to authoritative pages. HITS uses a mutual reinforcement operation to propagate authority and hub values to represent linking characteristics. Recent research on link analysis of hyperlinked documents applies HITS to the research area of topic distillation and proposes several HITS variations to enhance the significance of links in hyperlinked documents [11][17][18][19][20][48][55]. Hyperlink analysis is also applied to discover the concise structure of Web sites. Authority and hub values are applied to distil a complex structured Web site into a concise structure that consists of authoritative pages linked by hub pages [47][48]. However,
HITS-related algorithms do not perform well in mining a Web site's concise structure due to the effects of nepotistic clique attacks and the Tightly-Knit Community (TKC) effect [12]. Such effects appear more frequently within a Web site when analyzing and distilling the site structure [47].
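As a rough illustration of the mutual reinforcement operation described above, the following Python sketch iterates authority and hub scores over a small hyperlink graph until they stabilize. The graph is hypothetical; a production implementation would operate on a query-focused subgraph of the Web rather than four toy pages.

import math

# Hypothetical hyperlink graph: page -> pages it links to.
links = {
    "a": ["c", "d"],
    "b": ["c", "d"],
    "c": ["d"],
    "d": [],
}

auth = {p: 1.0 for p in links}
hub = {p: 1.0 for p in links}

for _ in range(50):  # fixed number of mutual-reinforcement steps
    # Authority of p grows with the hub scores of pages linking to p.
    auth = {p: sum(hub[q] for q in links if p in links[q]) for p in links}
    # Hub of p grows with the authority scores of pages p links to.
    hub = {p: sum(auth[q] for q in links[p]) for p in links}
    # Normalize so the scores converge instead of growing without bound.
    na = math.sqrt(sum(v * v for v in auth.values()))
    nh = math.sqrt(sum(v * v for v in hub.values()))
    auth = {p: v / na for p, v in auth.items()}
    hub = {p: v / nh for p, v in hub.items()}

print({p: round(v, 3) for p, v in auth.items()})  # d is the strongest authority
print({p: round(v, 3) for p, v in hub.items()})   # a and b are the strongest hubs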
2.4. Information extraction

Information Extraction (IE) is one way to alleviate the inefficient discovery of legal materials on the Web. Studies of IE [33][42][52][92] aim to mine structured information (metadata) from Web pages. Although able to extract valuable metadata from pages, most IE systems require labor-intensive effort. Cardie [16] defines five pipelined processes for an IE system: tokenization and tagging (manual labeling), sentence analysis, extraction, merging, and template generation. Based on domain-specific knowledge (concept dictionaries and templates) generated by the first two processes, machine learning methods are usually applied to learn, generalize, and generate rules in the last three processes [33]. Training instances applied to the learning processes are also manually selected and labeled. In Wrapper Induction [52], the author manually defines six wrapper classes, which consist of knowledge to extract data by recognizing delimiters that match one or more of the classes. The richer the wrapper classes, the more likely they will work for a new site [23]. SoftMealy [42] provides a GUI that allows a user to open a Web site, define metadata attributes, and label tuples in the Web page. The common disadvantage of IE systems is the time cost of manually generating templates, domain-dependent knowledge, or annotations of corpora. This is the very reason that these systems are only applied to specific Web applications that extract structured information from pages of specific Web sites or pages generated by CGI. Consequently, these IE systems are not scalable and therefore cannot be fully automated to extract Internet information. Additionally, IE systems try to generate rule templates from repeated patterns found on the entire Web page. However, the amount of useful content on most Web pages is minimal. For example, almost all commercial pages contain content blocks of logos, advertisements, navigation panels, related links, informative content, and copyright announcements [57]. Only informative content blocks are meaningful when locating repeated patterns and extracting structured information. Therefore, learning methods that use the entire page are not cost effective. The learning accuracy of IE systems will be low since many patterns, which are probably noise, need to be found and processed.

2.5. Data mining and machine learning
Machine learning addresses the question of how to build programs that improve performance through experience and heuristics. A well-defined learning problem requires a specified task, a performance metric, and a source of training experience [63]. The specified task determines the choice of learning algorithms, such as learning classification rules [72][83][37], discovering clustering patterns [1][43], or mining associations [1][3][82]. The performance metric is a guideline that evaluates the quality of a learning system. The training experience is the data source used to train and test the learning system.
Databases have been successfully applied in business management, government administration, medical management, scientific and engineering management, and many other fields. The explosive growth of data has driven investigation into new techniques and tools that obtain knowledge from databases. However, previous studies of machine learning merely dealt with small data sets, and performance and scalability become the major concerns in database learning. Consequently, data mining has become a popular research topic. The data mining system DBMiner [38] was developed for interactive mining of multiple-level knowledge in large relational databases. The system implements a wide spectrum of data mining algorithms and functions, including generalization, characterization, association, classification, and prediction. There are also many other terms that carry a similar or slightly different meaning to data mining, such as knowledge discovery in databases (KDD [70]), knowledge mining from databases, knowledge extraction, data archaeology, data dredging and data analysis [21]. Readers can refer to [21][39][46] for further information. The following is a summary of the major data mining methods.

Classification
Classification (supervised learning) is a well-known and widely used data analysis method that can automatically learn models or rules describing categories of data. Given a set of data assigned class labels, the learning system first partitions the data into two sets: training and testing. In the training phase, classification algorithms learn models that fit the training data. In the testing phase, the obtained models are used to predict the class labels of the testing data to verify the learning quality. Since databases consist of structured information (relational tables) and are rich with implicit information and knowledge, classification learning is frequently applied to obtain knowledge from databases to make business decisions. Many classification algorithms have been analyzed, including inductive and decision-tree-based methods such as ID3 [72], CN2 [25], C4.5 [73] and SLIQ [62]; statistical methods [27]; neural networks; as well as database-oriented classification methods like attribute-oriented induction [37]. Business applications of classification learning include classifying customer groups, market trends, and customer purchasing behavior.
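To illustrate the decision-tree-based methods cited above, the following Python sketch computes the entropy-based information gain that ID3-style algorithms use to pick a splitting attribute. The tiny customer table and the attribute names are hypothetical.

import math
from collections import Counter

# Hypothetical training records: attribute values plus a class label.
rows = [
    {"income": "high", "student": "no",  "buys": "no"},
    {"income": "high", "student": "yes", "buys": "yes"},
    {"income": "low",  "student": "yes", "buys": "yes"},
    {"income": "low",  "student": "no",  "buys": "no"},
    {"income": "high", "student": "no",  "buys": "yes"},
]

def entropy(labels):
    counts = Counter(labels)
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def information_gain(rows, attr, target="buys"):
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

# ID3 greedily splits on the attribute with the highest information gain.
for attr in ("income", "student"):
    print(attr, round(information_gain(rows, attr), 3))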
Clustering

Clustering (unsupervised learning) is an important data mining method that groups similar data together. The similarity (or dissimilarity) between objects is based on distance-based or density-based measures. For distance-based measures, there are two types of clustering methods: partitioning and hierarchical. Partitioning methods, such as k-Means, k-Medoids [39] and CLARANS [66][67], try to partition objects into k groups and iteratively improve the clustering by moving objects between groups. Hierarchical methods build a hierarchical decomposition of the given objects based on agglomerative (bottom-up) or divisive (top-down) approaches. CURE [35], BIRCH [97], ROCK [36], and CHAMELEON [49] are hierarchical methods. Most distance-based clustering methods are sensitive to noise or outliers and do not perform well for clusters that are not spherical in shape. To tolerate noise and discern clusters with
arbitrary shapes, density-based clustering methods like DBSCAN [29] and OPTICS [6] were developed. Except for CLIQUE [1], most clustering methods are designed for low-dimensional numerical data. CLIQUE identifies dense clusters embedded in subspaces of maximum dimensionality and generates cluster descriptions in the form of DNF expressions minimized for ease of comprehension. Clustering is particularly appropriate for the exploration of inter-relationships among sample objects. It offers a preliminary assessment of the sample structure, since it is difficult for humans to intuitively interpret data embedded in highly dimensional spaces [46]. For example, clustering can be used to identify different customer groups, characterize these groups, and determine market trends. Clustering can be integrated with other mining methods to create new hybrid applications. For example, mining associations between customer groups and buying patterns is useful when determining market trends and promotional programs for customers.
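The following Python sketch shows the partitioning idea behind k-Means as described above: objects are assigned to the nearest of k centroids, and centroids are recomputed until assignments stabilize. The 2-D points and the choice k = 2 are hypothetical.

import random

def k_means(points, k, iterations=100):
    # Start from k randomly chosen points as initial centroids.
    centroids = random.sample(points, k)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: (p[0] - centroids[c][0]) ** 2
                                            + (p[1] - centroids[c][1]) ** 2)
            clusters[i].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl))
            if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:  # assignments have stabilized
            break
        centroids = new_centroids
    return centroids, clusters

points = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (9, 9), (8, 9.5)]
centroids, clusters = k_means(points, k=2)
print(centroids)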
Association

Association rule mining [2][3][82] is used to discern interesting associations (or correlations) among itemsets (patterns) generated from a large data set, particularly supermarket transactional data. In fact, association rule mining is often referred to as market-basket analysis, which determines which items are frequently placed in one person's shopping cart. Apparently, only frequently purchased itemsets are attractive to market analyzers. According to the definition of the association rule in [2], the elements of the problem are items, transactions, and the database. Let I = {i1, i2, ..., im} be a set of items. Let D be a set of transactions (the transaction database), where each transaction T is a set of items such that T ⊆ I. An association rule is an implication of the form X → Y, where X ⊂ I, Y ⊂ I, and X ∩ Y = ∅.
Most association mining algorithms initially try to find the frequent itemsets that satisfy a pre-defined minimum support count. Association rules are then generated from these frequent itemsets according to the minimum support and confidence. To mine associations between X-itemsets and Y-itemsets, association mining algorithms such as Apriori [3] must iteratively generate candidate (k + 1)-itemsets from k-itemsets and scan the database transactions to verify the candidates. Since mining association rules may require multiple database scans, research has focused on performance improvement [24]. As such, many variations of the Apriori algorithm have been proposed to improve performance. For example, the DHP (direct hashing and pruning) algorithm was developed to efficiently generate candidates of large itemsets and to reduce the transaction size and the number of database scans [69]. The frequent pattern growth (FP-growth) method is different from Apriori-like algorithms. FP-growth first performs a database scan to construct an FP-tree, an extended prefix-tree structure for storing compressed, crucial information about frequent patterns. The major operations of mining association rules are count accumulation and prefix path count adjustment. Both are usually much less
costly than the candidate generation and pattern matching operations performed in most Apriori-like algorithms [40]. Occasionally, the pre-determined minimum support and confidence may be too high for rules to be found among raw items; in that case, applying a hierarchical taxonomy (is-a relationships) over items to mine generalized (multi-level) association rules is a useful approach [81]. Association rule mining is a useful tool for discovering an optimal item arrangement in a supermarket to help customers quickly find their products. It is also helpful for mining conceptual associations from Web documents. Applying association rule mining to extract correlations between keywords can enhance the semantics of correlated keywords (concepts). This enhancement improves the accuracy of automatic document classification [58].
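A minimal Python sketch of the Apriori-style level-wise search described above: frequent k-itemsets are joined into (k + 1)-candidates, and each level is verified against the transactions. The market-basket data and the minimum support count of 2 are hypothetical, and the Apriori subset-pruning step is omitted for brevity.

transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
]
min_support = 2  # minimum support count

def frequent_itemsets(transactions, min_support):
    items = {i for t in transactions for i in t}
    level = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while level:
        # Count the support of each candidate by scanning the transactions.
        counts = {c: sum(1 for t in transactions if c <= t) for c in level}
        survivors = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(survivors)
        # Join step: build (k+1)-candidates from the frequent k-itemsets.
        keys = list(survivors)
        level = list({a | b for a in keys for b in keys if len(a | b) == k + 1})
        k += 1
    return frequent

for itemset, count in sorted(frequent_itemsets(transactions, min_support).items(),
                             key=lambda p: (-len(p[0]), -p[1])):
    print(set(itemset), count)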
2.6. Document categorization

Categorizing Web documents is a productive approach to constructing domain knowledge (ontology) from the Web. The information space of the Web is summarized as many hierarchical concepts. There are two approaches used to categorize documents into a hierarchical tree: manual categorization and automatic categorization. Manual categorization, like the directory service of Yahoo!, is time consuming and expensive. The approach is not feasible due to the immense number of Web documents. In automatic categorization, the system predicts the class label based on document categorization knowledge acquired from domain experts or learned automatically from a set of documents [7]. Acquiring knowledge from domain experts, while relatively effective, is expensive in terms of time and knowledge maintenance. Furthermore, the knowledge acquired from experts is usually incomplete. In contrast, learning from documents is efficient and scalable, but accuracy is constrained by the employed learning model and the document set. Currently, no systems are able to automatically categorize documents into an acceptable hierarchy without human guidance. Therefore, a successful document categorization system is based on collaboration between humans and automatic programs. Many text categorization studies have been undertaken in information retrieval [7][45][53][54][94]. Herein, document categorization is used instead of text categorization since we focus on Web documents rather than general texts. Document categorization adopts many techniques from similarity-based document retrieval [94], relevance feedback [78], text filtering [64], text categorization [7][53], and text clustering [54]. For example, SIFTER [64] uses the vector space model for document representation, applies unsupervised learning to document categorization, and uses reinforcement learning for user modeling to filter documents. ExpNet [94] uses similarity measurement as the category ranking method to determine the best category for an input document. INQUERY [53] employs three different learning and mining techniques: a k-nearest neighbor (kNN) approach using belief scores as the distance metric, Bayesian independence classifiers, and relevance feedback. Conventional data mining methods are applied to obtain knowledge from databases in which each record (row or tuple) has attributes (columns) regarded as its features. However, there are no explicit features for documents. Thus, characterizing documents is the most important task when applying mining algorithms to document classification.
Similar to other data mining methods, document categorization can be divided into two types: document clustering and document classification.

Document clustering
Document clustering tries to discover clusters (or categories) such that documents in the same cluster are similar and documents from different clusters are dissimilar. Similarity (or dissimilarity) measures affect clustering performance. There are several ways to select similarity measures. For example, by regarding each document or cluster as a multi-dimensional distribution over a set of terms, the vector space model can also be applied to estimate the similarity between documents (or clusters). When a cluster of documents has been identified, the problem of denoting the cluster concept arises. Most studies use the centroid document of the cluster to represent the concept of the cluster. The document clustering method is composed of two processes: finding clusters and assigning documents. Usually, the document clustering process is user-interactive. First, users assign the number of clusters, m. The clustering system tries to partition the document set into the given number of clusters. The process can iterate until convergence is reached and clustering models are obtained. It can also interact with users to construct hierarchical clusters. Second, based on the obtained clustering models, the clustering system can assign (predict) new documents to clusters.

Document classification
Document classification attempts to assign documents to one or multiple pre-defined classes (categories). Given a set of classes (or a class hierarchy) of manually categorized documents, document classification tries to obtain classification knowledge by learning from the hierarchy. The knowledge is then applied to automatically categorize new documents. Previous machine learning studies developed many algorithms that perform well in many fields, including medicine and finance. These algorithms can be employed in document classification by characterizing document features; examples include Bayesian independence classifiers [54], the k-nearest neighbor method [34], and rule-based induction algorithms [7].
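As an example of the Bayesian independence classifiers mentioned above, the following Python sketch trains a multinomial naive Bayes model on a few labeled documents and predicts the class of new ones. The toy corpus is hypothetical; a real system would first characterize documents with the feature-extraction steps discussed in this section. Note how the word "apple" legitimately belongs to both classes, which is exactly the domain-dependent semantics discussed in section 3.2.

import math
from collections import Counter, defaultdict

# Hypothetical labeled training documents.
training = [
    ("computer", "apple releases new laptop with faster processor"),
    ("computer", "software update improves laptop security"),
    ("food", "apple pie recipe with cinnamon and sugar"),
    ("food", "fresh fruit salad recipe for summer"),
]

class_words = defaultdict(list)
for label, text in training:
    class_words[label].extend(text.split())

vocab = {w for words in class_words.values() for w in words}
priors = {c: sum(1 for l, _ in training if l == c) / len(training)
          for c in class_words}
counts = {c: Counter(ws) for c, ws in class_words.items()}

def predict(text):
    scores = {}
    for c in class_words:
        total = sum(counts[c].values())
        # Log-probabilities with Laplace smoothing over the vocabulary.
        score = math.log(priors[c])
        for w in text.split():
            score += math.log((counts[c][w] + 1) / (total + len(vocab)))
        scores[c] = score
    return max(scores, key=scores.get)

print(predict("apple laptop processor"))   # expected: computer
print(predict("apple recipe with sugar"))  # expected: food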
2.7. Web mining

Data mining has recently been applied to the World Wide Web [51][68]. Web mining applies data mining techniques to discover and extract information from Web documents and services, such as on-line travel agents, job listings, and electronic malls. Web mining is composed of the following tasks [30].

• Resource discovery: locating unfamiliar documents and services on the Web.
• Information extraction: automatically extracting specific information from newly discovered Web resources.
• Generalization: uncovering general patterns at individual Web sites and across multiple sites.
Information retrieval and document categorization are used in resource discovery. Information extraction is introduced in section 2.4. Generalization is the major challenge of Web mining: how do we generalize special cases of information extraction so that the same mining process can be applied to other Web sites? A manual labeling process is the bottleneck when generalizing Web mining tasks to applications in other fields. Applications of Web data mining should focus on three issues: Web structure mining, Web content mining, and Web usage mining [60].

Web structure mining
Given a collection of hyperlinked Web documents, Web structure mining systems try to discover concise structured information about the document set or a subset. For example, search engines may crawl and index all news pages in a news Web site. For commercial or user-friendly browsing purposes, a news page may contain information and links that are irrelevant to the news article, resulting in a redundant structure of the news site. The informative structure of a news site should be a set of table-of-contents (TOC) pages (with respect to news categories) linking to news article pages. Although HITS-related algorithms [11][17][18][19][20][48][55] are widely used in topic distillation by analyzing Web hyperlink relationships, there are no studies investigating Web site structure distillation. In [47], we borrow the link analysis method from HITS to distil the structure of a Web site. The distilled structure is referred to as the informative structure of the Web site. Structural distillation is useful in Web content mining.

Web content mining
Web content mining extracts semantic information from a given collection of Web documents. Given a set of documents collected from the directories of a portal site, mining term associations extracts a conceptual network that expands the domain knowledge of these directories [58]. A successful Web content mining task depends on the quality of the information resources. Mining content from all pages of a Web site rather than from the pages of the distilled structure is inefficient and ineffective, since the complete Web site structure contains a lot of redundant information. Therefore, Web structure mining can be a pre-processor for Web content mining.

Web usage mining
Web usage mining identifies user access patterns from Web server logs (the Web page access history). Mining Web logs can help a Web site understand user behavior. Analyzing and exploring regularities in this behavior can improve system performance, enhance the quality of information services, and identify potential customers for electronic commerce. By observing usage of data collections, data mining can be of considerable assistance to Web site designers when, for example, re-arranging the Web site map. WebLogMiner, based on DBMiner, uses data mining and data warehousing techniques to analyze Web log records [96]. In addition to providing benefits to the Web site design, Web usage mining is also useful for training and learning user profiles.
2.8. Intelligent Web agent
An agent can be one of a broad scope of entities such as hardware entities, software programs and humans [41]. Asking the question "what is an agent?" of the agent-based computing community is similar to asking the question "what is intelligence?" of the AI community [93]. Intelligent agent abilities include delegation, communication skills, autonomy, monitoring, actuation and intelligence [15]. In this paper, we focus on agents that apply machine learning and data mining techniques to facilitate intelligent applications on the Web. For example, applying machine learning techniques to a crawler makes it efficient at gathering documents on a specific topic [61][75]; such a crawler is referred to as a focused crawling agent. In this paper, we define an intelligent Web agent (IWA) as one with the following abilities.

• Crawling: the agent includes a crawler module that is able to gather Web pages. Most crawlers retrieve content by following hypertext links and ignore the tremendous amount of high-quality content hidden behind the search forms that connect to searchable electronic databases. This is the hidden Web [74]. An IWA should be capable of exploring the hidden Web.
• Understanding domain knowledge: an IWA usually starts with initial domain knowledge provided by humans and collects related Web documents based on that knowledge. For example, the Web mining agent ShopBot [28] uses descriptions of domains and vendors as its prior knowledge when comparing vendor attributes (e.g., price).
• Interacting, extracting and learning: an IWA should be able to extract useful information or knowledge from Web documents. Usually, an information extraction or learning subsystem is embedded in the IWA. For example, ShopBot [28] and ILA (Internet Learning Agent) [31] interact with the user to learn structured information from unfamiliar information sources.

3. THE I3 SYSTEM
In this section, the architecture of I3 is outlined. To develop an intelligent information system, several semantic problems must first be considered. In section 3.2, research issues concerning content semantics are discussed to clarify the design of the I3 system.

3.1. The architecture of the I3 system
As shown in Figure 1, the I3 system contains three horizontal layers that cooperate with the vertical component, the domain knowledge ontology. I briefly describe these components.
• I3 Web Analyzer (I3WA). This analyzer consists of two components: a Web crawler and a Web content and hyperlink analyzer. The Web crawler is responsible for gathering various documents from the Internet. However, the conventional crawler is unable to filter unwanted documents, whereas the I3WA crawler is designed to collect
documents for topics specified in domain knowledge. The Web content and hyperlink analyzer analyzes the collected documents with their corresponding hyperlinks. Based on the analyzed results, I3WA can make effective and efficient decisions to gather related documents.
• I3 Metadata Extractor (I3ME). After I3WA collects documents on certain topics, I3ME is employed to automatically extract metadata from semi-structured or unstructured Web documents. In this paper, the term metadata refers to structured information. The extracted metadata contain rich structured information that can facilitate the subsequent learning process.
• I3 Knowledge Learner (I3KL). Given initial domain knowledge (an ontology), a learning system must discover new knowledge and enhance the domain knowledge. I3KL applies several data mining and machine learning algorithms to obtain knowledge from Web documents.
• Domain Ontology. Currently, no successful intelligent information system is fully automatic. An intelligent system must interact with domain experts. Therefore, the domain ontology is an interface layer that interacts with experts to drive the learning process as well as to obtain and verify knowledge to enhance the knowledge base.

3.2. Semantic issues of the I3 system
Before the Semantic Web becomes popular, intelligent information systems are responsible for automatically (or semi-automatically) extracting information and obtaining knowledge from Web documents. Generally, a document is represented as a set of keywords in IR, IE and document categorization. This representation is deficient, since content semantics cannot be correctly extracted without knowing the context of these keywords. A document written in natural language text is context-sensitive, and its meaning depends heavily on writers and readers. In this section, several issues that may affect the effectiveness of content semantics extraction are illustrated. Obviously, these issues also have effects on learning.
Content semantics associated with the domain
Generally, information systems deal with the content semantics of a document in a deterministic way, i.e., the parsed or extracted content semantics cannot be changed when encountering different problems, users, or domain classes. However, content semantics varies from domain to domain. For example, apple has different meanings in documents from different domains, such as computer and food. Information systems usually ignore the semantic diversity of keywords used in different documents and domains. The I3 system applies association rule mining to discover term associations from documents in certain classes (topics). Term associations mined from a class's documents are used to enhance classification knowledge. Experiments show that term associations improve the classification accuracy of document categorization [58].

Detection of linguistic information
Detecting the language of a document is the first and most important step toward understanding content semantics. However, many Web documents are published with incorrect linguistic information or none at all. Documents are regarded as binary (or ASCII) strings in computer programs. For example, while processing Traditional Chinese documents (corresponding to the Big5 character set), information systems might collect documents written in Simplified Chinese (corresponding to the GB2312 character set). Some Simplified Chinese documents even incorrectly indicate Big5 CHARSET information in their HTML files. These documents become noisy data when processing the content semantics of Big5 documents. Although Unicode has been proposed to unify the character sets of different languages, many documents are still published in their local encodings. In the I3 system, I3WA includes a linguistic detector that determines the document language based on the probabilities of characters appearing in different character sets.

Semantic gaps between writers and readers
The Web is a distributed environment in which various individuals publish documents in their own ways, languages, and considerations. People might use the same words to express different meanings; conversely, they might use different words to convey the same meaning. Correspondingly, document interpretation depends on individuals. Different users will evaluate the same search results differently. The I3 system uses thesauri and user profiles to deal with this problem. Thesauri are used to align the concepts represented in a document, and user profiles are used to trace individual behavior.

Stop words and stemming words
Stop words (or negative dictionaries) are commonly found in almost every document. These words, e.g., the, a, an, have no discrimination value for searching and mining.
However, stop words are also highly domain dependent. A stop word might become meaningful when the application domain becomes general; on the other hand, a usually meaningful word might become useless in a specific domain. For example, computer is probably a stop word in computer literature databases. Intuitively, stop word lists should include the most frequently occurring words in the documents of some domain. Numerous studies show that if the words in a document are ranked in order of decreasing frequency, they follow a relationship known as Zipf's law [98]. Applying Zipf's law to the documents of certain classes can identify domain stop words; a small sketch follows at the end of this subsection.

Word stemming maps multiple representations of a word into a single stemmed term to provide significant compression and improve recall. However, the precision measure, based on minimizing non-relevant information, may be reduced by the recall-increasing effect of stemming. For example, memorial and memorize can both be stemmed to memory, but they are not synonyms and have different meanings. Consequently, word stemming influences both recall and precision and should be carefully handled when designing information systems. Generally, a stemming algorithm removes suffixes and prefixes to derive the final stem. The Porter algorithm [71] is based on a set of conditions on the stem, suffix and prefix, together with associated actions. Some stemming methods are based on dictionaries. Studies of various stemming methods are summarized in [32][8].

Since stop words and word stemming affect the content semantics of documents, both techniques must be embedded in the design of the I3 system. The interpretation of document semantics depends on the problem domain, the document's categorization, and the user's profile. Document categorization that obtains classification knowledge from a hierarchical directory is useful in dealing with these semantic issues. By predicting the document class, the class information can be applied to identify the content semantics. Combined with user profiles, document semantics can be precisely mapped to the user's information needs.
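The sketch below, in Python, illustrates the Zipf-style identification of domain stop words mentioned above: terms from a single-domain corpus are ranked by decreasing frequency, and the top of the ranking becomes the candidate stop word list. The toy corpus and the top-10% cut-off are illustrative assumptions rather than parameters of the original design.

from collections import Counter

# Hypothetical documents from a single domain (e.g., computer literature).
domain_docs = [
    "computer systems store and process data",
    "a computer program runs on computer hardware",
    "data structures help a program manage data",
]

freq = Counter(w for doc in domain_docs for w in doc.split())
ranked = freq.most_common()  # decreasing frequency, as in a Zipf plot

# Candidate domain stop words: the top 10% most frequent terms.
cutoff = max(1, len(ranked) // 10)
stop_words = [term for term, _ in ranked[:cutoff]]
print(stop_words)  # e.g., ['computer'] for this toy corpus
for rank, (term, count) in enumerate(ranked[:5], start=1):
    print(rank, term, count)  # rank * frequency stays roughly constant under Zipf's law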
4. I3 WEB ANALYZER

In the I3 system, the bottom layer is the I3 Web Analyzer (I3WA). I use the term analyzer rather than agent since a Web agent is a complete and complex system that includes the abilities to crawl, understand, interact, extract and learn. The implementation of the extracting and learning components is highly dependent on domain knowledge. Accordingly, I elicit the first three abilities from the Web agent to build I3WA. By interacting with domain knowledge, I3WA focuses on crawling and understanding Web documents and hyperlinks. I3WA pre-processes the Web and extracts useful information for the extracting and learning subsystems that follow it, respectively I3ME and I3KL. Basically, I3WA is composed of a Web Crawler and a Web Content and Hyperlink Analyzer. In view of the semantic issues introduced in section 3.2, the analyzer must cooperate with the Document Parser and the Linguistic Detector to identify a Web document's content and linguistic information. The analyzer performs analysis on both content and hyperlinks. The architecture of I3WA is shown in Figure 2.
Figure 2. I3WA components.
In the following, several constraints that users may specify in I3WA are summarized, and the I3WA components corresponding to these constraints are identified. The details of each component are illustrated in subsections 4.1 through 4.4.
• Document type constraints: The document types that can be processed in an information system should be restricted, otherwise unpredictable results will be encountered when processing documents of unknown types. Web mining systems generally focus on HTML documents. Therefore, these systems lose information sources of other document types such as PDF, PS, DOC, PPT and XLS, which are increasingly popular on the Web. I3WA includes a configurable crawler and document parser for parsing these document types.
• Linguistic constraints: Users may be interested in documents written in specific languages; that is, the analyzer must contain a linguistic detector to collect only documents written in the specified languages.
• Structure constraints: Users may be interested in gathering documents from specific structures of a Web site. For example, say a person wants to organize a hierarchical directory of sports documents. He will need a crawler that only collects sports directories from several portal sites to construct a hierarchical directory of sports documents. I3WA should have the ability to understand the structure of a site. This corresponds to the functionality of the structure analyzer.
• Topic constraints: Given a set of concepts (keywords) representing user topics, I3WA should be able to harvest documents related to these topics. This task is performed by the content analyzer.

4.1. Web crawler and document parser
Based on network protocols like HTTP, NNTP and FTP, the crawler is able to automatically gather various kinds of documents and information sources from the Internet. In this paper, a description of the crawler is omitted since it is a well-known search engine component.
The implementation of the document parser is based on Microsoft Windows systems. Windows provides COM (or DCOM) for invoking software components. The document parser determines a document's type at run time and invokes the parsing components corresponding to that document type. For example, it calls the IE HTMLParser for parsing HTML files. In the same way, it conveniently supports the MS Office, Outlook E-Mail (.eml), PDF and PS document types. The programming details are beyond the scope of this paper.

4.2. Linguistic detector
In this paper, the I3 system is only constructed to deal with the linguistic information of English, Traditional Chinese (Big5), and Simplified Chinese (GB). The former is written in one-byte ASCII code, and the latter two languages are written in two-byte codes. Therefore, distinguishing English from Chinese documents is easy. As for Chinese content, we can theoretically get the linguistic information of an HTML document from the sub-tag CHARSET of the META tag specified in the HTML file. However, as described in section 3.2, CHARSET tags often contain wrong language information, or none at all. As Taiwan uses Big5 characters, the frequency of each Big5 character is estimated from documents collected in Taiwan. Similarly, the frequency of each GB character is estimated from documents collected in mainland China, as China uses the GB system. The Big5 and GB character frequencies are then normalized and translated into probabilities. In this way, Big5 and GB character-probability tables are generated. Given a document, the linguistic detector first extracts characters and then looks up the corresponding probabilities in both tables. The probabilities of the characters are accumulated to become the document's Big5- and GB-probability values. The larger probability value indicates the document's language type. On 5,000 randomly selected pages from Taiwan (the Big5 answer set) and China (the GB answer set), the precision rate of the linguistic detector is 0.975, with 125 pages missed. After manually checking these missed pages, we removed Big5 pages published in China and GB pages published in Taiwan; the precision rate then increased to 0.996. Therefore, the linguistic detector is effective in determining linguistic information.
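A minimal Python sketch of the character-probability detection described above, assuming the per-character probability tables have already been estimated from the two regional corpora. The tiny tables are hypothetical, and the sketch accumulates log-probabilities (a common variant of summing probabilities) to avoid numerical underflow on long documents.

import math

# Hypothetical character-probability tables estimated from regional corpora.
big5_prob = {"的": 0.040, "是": 0.015, "電": 0.004}
gb_prob = {"的": 0.042, "是": 0.014, "电": 0.004}
FLOOR = 1e-8  # probability assigned to characters unseen in a table

def detect(text):
    # Accumulate per-character log-probabilities for each character set.
    big5_score = sum(math.log(big5_prob.get(ch, FLOOR)) for ch in text)
    gb_score = sum(math.log(gb_prob.get(ch, FLOOR)) for ch in text)
    return "Big5" if big5_score > gb_score else "GB"

print(detect("電的是"))  # expected: Big5 (uses the traditional form 電)
print(detect("电的是"))  # expected: GB (uses the simplified form 电)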
4.3. Structural analyzer

To collect specific pages from Web sites, users usually observe CGI patterns or URL addresses and identify templates for these pages. For example, directory pages of Yahoo! can be represented by the template http://dir.yahoo.com/*/, where "*" indicates the directory name. The open directory service provided by Google is http://directory.google.com/Top/*/. The hierarchical directory structures of both sites are implied in the path information of both templates. However, not all sites generate Web structures with simple patterns. The URLs of AltaVista's directories10 are not simple, and the hierarchical information of the directories is not implied in the URLs. Consequently, there are no trivial solutions for discovering specific Web site structures.

10 "http://www.overture.com/d/search/p/altavista/odp/us/?c=directory" is the template for generating directory pages of AltaVista (http://www.altavista.com/).
In a systematic Web site, the informative structure is composed of a set of TOC (Table of Contents) pages and a set of article pages linked by these TOC pages [48]. The definition can be extended as follows:

• The informative structure of a Web site consists of a set of TOC pages. A TOC page indicates a directory that contains links to TOC pages as sub-directories and links to article pages as directory objects.

An example of an informative structure is the directory structure of a portal site. The problem of mining informative structures from Web sites, and its solution, are described in the LAMIS system (Link Analysis of Mining Informative Structure) [47][48]. I3WA applies LAMIS to analyze and distil the Web site structure to serve the requirements of structure constraints.

4.4. Content analyzer
After discovering the informative structure of a Web site, we obtain TOC and article pages for information extraction and learning purposes. However, redundant and irrelevant links in article pages are not easily discovered. This is the intra-page redundancy problem described in section 2.1. To deal with this problem, we need a content analyzer to extract informative content from a page. Based on the W3C Document Object Model (DOM) [85], an HTML page can be parsed and represented by a tree structure in which internal nodes indicate HTML tags and leaf nodes indicate texts. With DOM, programmers can build documents, navigate their structure, and add, modify, or delete elements and content. Accordingly, the solution to the intra-page redundancy problem can be mapped into locating informative leaf nodes (texts) in an HTML document's DOM-tree. Since there are probably too many leaves in a DOM-tree, finding informative elements is complex and tedious. In our InfoDiscoverer study [57], about 70% of Web pages were found to use <table> for presentation, i.e., the DOM-tree can be generalized to the <table> level to simplify the problem of discovering informative texts. Informative texts appearing in the same table become an informative content block. The problem is defined below.

• The informative content block of a Web page contains texts that appear in a table, i.e., texts between the nearest <table> and </table> tags.

I propose a method, called InfoDiscoverer, to discover informative content blocks from Web pages. Experiments show that InfoDiscoverer is efficient and effective for discovering informative content; both precision and recall rates are over 95% for the tested pages [57]. As a result, I3WA's content analyzer employs InfoDiscoverer to analyze informative content blocks to deal with the problem of intra-page redundancy. I3WA integrates LAMIS with the proposed InfoDiscoverer to discover informative structures and content from Web sites. The problems and solutions proposed in LAMIS and InfoDiscoverer can be combined as shown below.
• Given a Web site, the I3 crawler first gathers all pages from the site. LAMIS is applied to discover the informative structure of the site, a set of article pages linked by a set of hierarchical TOC pages. InfoDiscoverer is then used to extract the informative content blocks of each page. LAMIS uses the feedback of informative content blocks to refine the informative structure into a set of informative content blocks (article pages) linked by a set of informative content blocks (TOC pages). Article pages are defined as pages linked by anchors appearing in the informative blocks of a TOC page. These article pages also form a new data set from which InfoDiscoverer extracts informative blocks as the meaningful content of the article pages. Consequently, the informative structure of a Web site is represented as a set of TOC blocks pointing to a set of article blocks.

LAMIS (the structure analyzer) and InfoDiscoverer (the content analyzer) interact with each other as shown in Figure 2. The studies of topic distillation introduced in section 2.3 are useful when trying to find documents on certain topics using search engines. Some topic distillation methods suffer from the problems of nepotistic clique attacks and the Tightly-Knit Community (TKC) effect [12]. The refined informative structure and content can deal with these problems [48]. Therefore, I3WA combined with topic distillation methods is adequate for dealing with the structure and topic constraints described at the beginning of this section.
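A minimal Python sketch of the table-level content-block idea described above, using the standard library's HTMLParser to group leaf texts by their enclosing <table>. It only illustrates block extraction for non-nested tables; it does not implement InfoDiscoverer's actual measure for deciding which blocks are informative.

from html.parser import HTMLParser

class TableBlockExtractor(HTMLParser):
    """Group the texts of an HTML page by their enclosing <table>."""
    def __init__(self):
        super().__init__()
        self.depth = 0      # current <table> nesting depth
        self.blocks = {}    # block id -> list of text fragments
        self.block_id = 0

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.depth += 1
            self.block_id += 1

    def handle_endtag(self, tag):
        if tag == "table" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.depth > 0:  # keep only texts inside some table
            self.blocks.setdefault(self.block_id, []).append(text)

# Hypothetical page with one content-like table and one advertisement table.
page = """<html><body>
<table><tr><td>Sports News</td><td>Tech News</td></tr></table>
<table><tr><td>Ad: buy now!</td></tr></table>
</body></html>"""

parser = TableBlockExtractor()
parser.feed(page)
for bid, texts in parser.blocks.items():
    print(bid, texts)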
4.5. Summary of I3WA

The traditional Web data flow is improved by employing I3WA as a pre-processor for Web application systems, such as search engines, IE systems and document categorization systems. The new Web data flow of the I3 system is shown in Figure 3. In conclusion, I3WA extracts informative structure and content that are useful in the following ways.

• Crawlers and Web agents can focus on the informative structure to precisely and efficiently explore useful information for further analysis.
• Search engines can improve performance by indexing only the informative content blocks of article pages rather than the entire content of all pages of Web sites. As a consequence, by removing indexes of redundant content, the index size is reduced and the retrieval precision is increased.
• IE systems expect input Web pages to possess a high degree of regularity so that structured information (metadata) encoded in these pages can be discovered. Apparently, I3WA can be a pre-processor for IE systems, improving their efficiency and effectiveness in exploring repeated patterns in Web pages.

5. I3 METADATA EXTRACTOR
I3WA provides the I3 Metadata Extractor (I3ME) with concise data, the informative structure and content of Web sites, for information extraction, as shown in Figure 3.

Figure 3. Web data flow in the I3 system.

I3ME focuses
on these concise data and extracts structured information. Extracting structured information from Web documents is domain-dependent. An automatic IE system should minimize the amount of labor-intensive work required. I3ME applies DOM-tree representation, a full-text indexing technique, and BLAST services [65] to reduce the requirement for domain knowledge. I3ME is composed of the following components. Its structure is shown in Figure 4.
Figure 4. Components of I3ME and the data flow.

• Data Pre-processor. It contains a Web Crawler, Document Parser, and Linguistic Detector when I3ME is designed as an isolated system. In the I3 system, these components are embedded in I3WA, where a Web page is pre-processed and represented as a DOM-tree (by the Document Parser) in which a tree node indicates a part of the informative content.
• Tokenizer. It translates character sequences and tags into tokens and performs generalization/specialization processes on the DOM-tree processed by I3WA. In I3ME, the IE problem is mapped into the problem of sequence alignment in the area of Bioinformatics. BLAST [5][65] is employed to extract similar patterns (corresponding to protein sequences in BLAST). The Tokenizer maps character-tag sequences into protein sequences encoded in 20 amino acids. A tag corresponds to an amino acid and can be generalized. Texts are treated as one amino acid. However, text keywords that appear in domain dictionaries are regarded as another amino acid to represent metadata fields. For example, text keywords like "Function," "Responsibility," and "Qualification" are probable metadata fields. Therefore, texts must be segmented to extract keywords. Text segmentation is trivial for English-like languages. However, in processing Asian languages like Chinese or Japanese, there are no delimiters for separating character sequences into words. I3ME performs a dictionary-based term segmentation method to extract terms from Chinese texts. The method was developed in our information system, ACIRD [56].
• Full-text Indexer. Given a Web page, the Tokenizer outputs patterns (protein sequences) indicating the sentences or paragraphs appearing in the page. Intuitively, a repeated pattern (or sub-pattern) appearing in a page or a set of pages is a candidate record for mining structured information (metadata). The pattern can be regarded as a string and indexed by the full-text index engine that we developed for searching Chinese Web pages in ACIRD. The weighting scheme of the full-text index, such as TF × IDF, is changed to term frequency (TF) weighting.
• Long Repeated Pattern Extractor. Long repeated patterns can be easily retrieved from the full-text indexes. Such a pattern indicates a candidate record to be matched against metadata templates.
• BLAST Server. Candidate patterns are sent to the BLAST Server to match similar templates and retrieve the corresponding metadata. In the beginning, there are no matched templates since the template database is empty. Through the Score Evaluator and User Label Interface, users can label these candidates as templates with metadata, or skip some of them.

Currently, I3ME is still in the development stage. Many IE systems have been successfully developed for specific domains, as described in section 2.4. In the I3 system, we propose I3ME to construct a general IE system that is less dependent on domain knowledge.
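To make the Tokenizer's encoding concrete, the following Python sketch maps a character-tag sequence onto a short symbol string over a small alphabet, so that repeated patterns can then be found with string tools such as BLAST-style alignment. The tag-to-symbol table, the keyword dictionary, and the tokenized page fragment are all hypothetical; they only illustrate the kind of encoding performed before matching, not I3ME's actual alphabet.

# Hypothetical mapping from (generalized) HTML tags and text classes to symbols,
# mimicking I3ME's encoding of character-tag sequences as protein-like strings.
SYMBOLS = {
    "table": "A", "tr": "B", "td": "C", "b": "D",
}
DICTIONARY = {"function", "responsibility", "qualification"}

def encode(tokens):
    """Map a token stream (tag names and text fragments) to a symbol string."""
    out = []
    for kind, value in tokens:
        if kind == "tag":
            out.append(SYMBOLS.get(value, "X"))  # X = unknown/generalized tag
        else:  # text token: dictionary keyword vs ordinary text
            out.append("K" if value.lower() in DICTIONARY else "T")
    return "".join(out)

# A hypothetical tokenized fragment of a job-posting page.
tokens = [("tag", "tr"), ("tag", "td"), ("text", "Function"),
          ("tag", "td"), ("text", "design web systems"),
          ("tag", "tr"), ("tag", "td"), ("text", "Qualification"),
          ("tag", "td"), ("text", "experience with databases")]
print(encode(tokens))  # "BCKCTBCKCT": the repeated "BCKCT" suggests a metadata record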
6. I3 KNOWLEDGE LEARNER

Organizing Web documents as hierarchically structured directories is a common method for managing information. The concept of hierarchical directories is widely used in phone books, address books, libraries, and file systems. It is the most natural way for humans to organize information as knowledge. Therefore, the ontology (or concept hierarchy) is the intrinsic domain knowledge. The I3 Knowledge Learner (I3KL) is a supervised learning system for document categorization. It obtains classification knowledge from the initial domain ontology, which contains hierarchical directories and documents manually constructed by people. I3KL is an extension of our previous work, ACIRD [56].

6.1. The ACIRD system
The Automatic Classifier for Internet Resource Discovery (ACIRD) [56] is an intelligent information system that automatically collects and classifies Web documents for efficient and effective management and retrieval. ACIRD initially focuses on improving the expensive and time-consuming manual classification process. Figure 5 schematically shows the data flow in ACIRD. Domain experts provide a class lattice (directories) with a set of training data (documents) assigned to one or several classes. The Classification Learner learns from the training data and generates classification knowledge (or class indexes) for the classes in the class lattice. The Web Crawler automatically collects documents from the Internet, and the Pre-processing Process extracts features (terms or keywords) from the documents. The Document Classifier proceeds to predict and assign one or more of the most appropriate classes to the incoming documents. When users submit queries to ACIRD, the Two-Phase Search Engine matches the queries with the indexes of documents and classes and presents a hierarchical view to the users to facilitate information discovery.

6.2. Mining term associations
ACIRD applies association rule mining to mine term associations from the documents of a domain. The problem of mining associations is mapped from itemsets of transactions to terms of documents. The transaction database corresponds to the document set. Two critical issues should be considered before applying the association mining process: the granularity of a transaction and the database of the document set (a domain).
Figure 5. The major components and data flow in ACIRD.
Granularity of mining associations
In [44], the authors propose restricting the granularity of generating associations to 3-10 sentences per paragraph in order to reduce the computational complexity. The restriction is impractical for Web documents, since a paragraph may have hundreds of meaningful sentences. In addition, the importance of a sentence in a Web document depends on the associated HTML tags, not on its position in the text. Therefore, we define the granularity of mining term associations to be the entire informative content extracted by I3WA.

Domain of mining associations
As Web documents are published by different Web developers, one term may be represented differently by different developers. Therefore, we restrict the transaction database of association mining to the documents categorized in a class when performing the term association mining process. ACIRD applies association rule mining to mine term associations via the following translations:
• Terms appearing in documents correspond to items. Termsets correspond to itemsets.
• The informative content extracted by I3WA corresponds to a transaction.
• A class corresponds to the transaction database. A class represents a domain.

The definitions of support and confidence stay the same as those used in mining association rules. For example, in the class Art, the initial supports of exhibition
and art are sup(exhibition, Art) = 0.13 and sup(art, Art) = 1, respectively. The term exhibition would be removed from the keywords of the class if the support threshold were set to 0.2. Mining term associations in the class Art yields the term association rule

exhibition → art, with conf(exhibition → art) = 0.826 and sup(exhibition → art) = 0.1.
Assume that a rule with 10% support is useful. The optimized support of the term exhibition is promoted from 0.13 to 0.826 since the term is strongly associated with the term art [58]. Experiments show that term association mining is useful for enhancing the semantics between terms and therefore for improving classification accuracy. In I3KL, term association mining is used to construct the thesaurus of a domain, which represents a class and its corresponding subclasses in a directory hierarchy. That is, term associations form a conceptual network, like WordNet (http://www.cogsci.princeton.edu/~wn/w3wn.html), for a specific domain.
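A minimal Python sketch of the support and confidence computation behind the example above, treating each document's informative content as a transaction of terms drawn from a single class. The toy documents and the resulting numbers are hypothetical and do not reproduce the figures reported for the Art class.

# Each transaction is the term set of one document's informative content,
# all drawn from a single class (domain), e.g., Art.
transactions = [
    {"art", "exhibition", "gallery"},
    {"art", "painting"},
    {"art", "exhibition", "museum"},
    {"art", "sculpture"},
    {"art", "museum"},
    {"art", "painting", "gallery"},
    {"art", "exhibition"},
    {"art", "history"},
    {"art", "gallery"},
    {"art", "painting"},
]

def support(itemset):
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(x, y):
    # conf(x -> y) = sup(x U y) / sup(x)
    return support(x | y) / support(x)

x, y = {"exhibition"}, {"art"}
print("sup(exhibition) =", support(x))             # 0.3 in this toy class
print("sup(exhibition -> art) =", support(x | y))  # 0.3
print("conf(exhibition -> art) =", confidence(x, y))  # 1.0: every exhibition doc mentions art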
7. I3 APPLICATIONS
The main goal of the I3 system is to construct an information system that simplifies obtaining information and knowledge from the Web. Applications of the I3 system behave like a transparent assistant that helps users obtain useful information and make decisions on the Internet. For example, I3 applications can monitor events, make decisions, and execute tasks to automatically manage risks, exchange business information, make transactions, or serve the requirements of individuals. In the Web environment, I3 techniques are widely used to retrieve, organize, and manage Web sites, especially portal sites. These applications are summarized as follows:
• Intelligent content management. To develop adaptive Web sites, I3 techniques are usually used to manage content publishing with a minimum labor requirement.
• Intelligent interface for retrieving Web documents. By combining directory services, search engines, and document categorization techniques, an information system provides users with various perspectives for viewing and retrieving the Web information space. The two-phase search of the I3 system uses catalogs to improve search.
• Personalization. To customize a Web site according to user profiles, I3 techniques can be used to extract profiles from users' access logs, ratings and recommendations.
• Customer relationship management (CRM). Using the Internet as a communication platform between customers and companies is the current trend. CRM information provides companies with information about products and services that are of interest to customers. Many companies have invested a lot of time and money in building CRM platforms to enable Internet businesses.
• Intelligent agents. Shopping or auction agents can be programmed to search for specific items on the Web, extract and monitor price information, and make decisions. They can notify users of events by sending e-mails or make decisions automatically.
• Text analysis and summarization. Many intelligent information applications are used to analyze and summarize textual information from the Internet. These applications automatically collect documents from the Internet, identify the languages of the documents, create or predict classes (or clusters) of documents for conceptual representations, and summarize information from the documents.
There are countless applications of I3 for accessing the treasure trove of Web information. We are currently designing an information infrastructure based on many autonomous I3 systems. Each I3 system is autonomous and focuses on collecting, organizing, managing, and learning from Web documents of specific domains. Connecting these systems forms an intelligent network that provides a document and query routing environment. When a document or a query is submitted to the collaborative I3 environment, it can be routed to the appropriate I3 systems (nodes).
8. CONCLUSIONS
In this chapter, an Intelligent Internet Information System (I3 system) is proposed to automatically obtain useful information and knowledge from the Web. The three-layer architecture of the I3 system clearly partitions the problem of learning from Web documents into three parts:
• The bottom layer, I3WA, analyzes the Web and mines informative structure and content from Web sites.
• The middle layer, I3ME, extracts structured information from the informative structure and content.
• The top layer, I3KL, obtains knowledge from the extracted metadata and improves the insufficient semantics of the current Web.
In this three-layer architecture, the sub-system of each layer deals with its corresponding problem. Several ways to apply I3 are also described. Although applications of Web intelligent systems are domain dependent, the I3 system tries to integrate several information techniques to build an adaptive I3 framework.
REFERENCES
[1] R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan, "Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications," Proceedings of the ACM SIGMOD International Conference, pages 94-105, 1998.
[2] R. Agrawal, T. Imielinski, and A. Swami, "Mining Association Rules between Sets of Items in Large Databases," Proceedings of the ACM SIGMOD International Conference on Management of Data, May 1993.
[3] R. Agrawal and R. Srikant, "Fast Algorithms for Mining Association Rules," Proceedings of the 20th International Conference on VLDB, September 1994.
[4] J. Allan, "Relevance Feedback with Too Much Data," Proceedings of the ACM SIGIR International Conference on Information Retrieval, pages 337-343, July 1995.
[5] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, "Basic Local Alignment Search Tool," Journal of Molecular Biology, 215:403-410, 1990.
[6] M. Ankerst, M. M. Breunig, H.-P. Kriegel, and J. Sander, "OPTICS: Ordering Points to Identify the Clustering Structure," Proceedings of the ACM SIGMOD International Conference, pages 49-60, 1999.
[7] C. Apte, F. Damerau, and S. M. Weiss, "Automated Learning of Decision Rules for Text Categorization," ACM Transactions on Information Systems, 12(3):233-251, July 1994.
[8] R. Baeza-Yates, "Modern Information Retrieval," Addison Wesley, 1999.
[9] T. Berners-Lee, R. Cailliau, et al., "The World-Wide Web," Communications of the ACM, 37(8):76-82, August 1994.
[10] T. Berners-Lee, "Semantic Web Road Map," http://www.w3.org/DesignIssues/Semantic.html.
[11] K. Bharat and M. R. Henzinger, "Improved Algorithms for Topic Distillation in a Hyperlinked Environment," Proceedings of the ACM SIGIR International Conference on Information Retrieval, 1998.
[12] A. Borodin, G. O. Roberts, J. S. Rosenthal, and P. Tsaparas, "Finding Authorities and Hubs from Link Structures on the World Wide Web," Proceedings of the 10th International World Wide Web Conference, pages 415-429, 2001.
[13] S. Brin and L. Page, "The Anatomy of a Large-scale Hypertextual Web Search Engine," Proceedings of the 7th International World Wide Web Conference, 1998.
[14] A. Broder, S. Glassman, M. Manasse, and G. Zweig, "Syntactic Clustering of the Web," Proceedings of the 6th International WWW Conference, pages 391-404, 1997.
[15] A. Caglayan and C. Harrison, "Agent Sourcebook: A Complete Guide to Desktop, Internet, and Intranet Agents," John Wiley & Sons, 1997.
[16] C. Cardie, "Empirical Methods in Information Extraction," AI Magazine, 18(4):65-79, 1997.
[17] S. Chakrabarti, "Integrating the Document Object Model with Hyperlinks for Enhanced Topic Distillation and Information Extraction," Proceedings of the 10th International World Wide Web Conference, 2001.
[18] S. Chakrabarti, B. Dom, P. Raghavan, S. Rajagopalan, D. Gibson, and J. M. Kleinberg, "Automatic Resource Compilation by Analyzing Hyperlink Structure and Associated Text," Proceedings of the 7th International World Wide Web Conference, 1998.
[19] S. Chakrabarti, B. Dom, S. Kumar, P. Raghavan, S. Rajagopalan, A. Tomkins, D. Gibson, and J. M. Kleinberg, "Mining the Web's Link Structure," IEEE Computer, 32(8):60-67, August 1999.
[20] S. Chakrabarti, M. Joshi, and V. Tawde, "Enhanced Topic Distillation using Text, Markup Tags, and Hyperlinks," Proceedings of the ACM SIGIR International Conference on Information Retrieval, 2001.
[21] M. S. Chen, J. Han, and P. S. Yu, "Data Mining: An Overview from a Database Perspective," IEEE Transactions on Knowledge and Data Engineering, 8(6):866-883, 1996.
[22] L. F. Chien, "PAT-Tree-Based Keyword Extraction for Chinese Information Retrieval," Proceedings of the ACM SIGIR International Conference on Information Retrieval, 1997.
[23] B. Chidlovskii, "Wrapper Generation by k-Reversible Grammar Induction," Workshop on Machine Learning for Information Extraction, August 2000.
[24] D. W. Cheung, V. T. Ng, A. W. Fu, and Y. J. Fu, "Efficient Mining of Association Rules in Distributed Databases," IEEE Transactions on Knowledge and Data Engineering, 8(6):911-922, December 1996.
[25] P. Clark and T. Niblett, "The CN2 Induction Algorithm," Machine Learning Journal, 3(4):261-283, 1989.
[26] W. B. Croft and P. Savino, "Implementing Ranking Strategies Using Text Signatures," ACM Transactions on Office Information Systems, 6(1):42-62, January 1988.
[27] T. G. Dietterich, "Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms," Neural Computation, 10(7):1895-1924, 1998.
[28] R. Doorenbos, O. Etzioni, and D. S. Weld, "A Scalable Comparison-Shopping Agent for the World-Wide Web," Proceedings of the 1st International Conference on Autonomous Agents, pages 39-48, February 1997.
[29] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, "A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases," Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, pages 226-231, 1996.
[30] O. Etzioni, "The World-Wide Web: Quagmire or Gold Mine," Communications of the ACM, 39(11):65-68, November 1996.
[31] O. Etzioni and M. Perkowitz, "Category Translation: Learning to Understand Information on the Internet," Proceedings of the 15th International Joint Conference on AI, pages 930-936, 1995.
[32] W. B. Frakes and R. Baeza-Yates, "Information Retrieval: Data Structures and Algorithms," Prentice Hall, 1992.
[33] D. Freitag, "Machine Learning for Information Extraction," Ph.D. Dissertation, Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, 1998.
[34] N. Fuhr, "Models for Retrieval with Probabilistic Indexing," Information Processing and Management, 25(1):55-72, 1989.
[35] S. Guha, R. Rastogi, and K. Shim, "CURE: An Efficient Clustering Algorithm for Large Databases," Proceedings of the ACM SIGMOD International Conference, pages 73-84, 1998.
[36] S. Guha, R. Rastogi, and K. Shim, "ROCK: A Robust Clustering Algorithm for Categorical Attributes," Proceedings of the 15th International Conference on Data Engineering, 1999.
[37] J. Han, Y. Cai, and N. Cercone, "Knowledge Discovery in Databases: An Attribute-Oriented Approach," Proceedings of the 18th VLDB Conference, pages 547-559, 1992.
[38] J. Han, Y. Fu, W. Wang, J. Chiang, W. Gong, K. Koperski, D. Li, Y. Lu, A. Rajan, N. Stefanovic, B. Xia, and O. R. Zaiane, "DBMiner: A System for Mining Knowledge in Large Relational Databases," Proceedings of the International Conference on Data Mining and Knowledge Discovery, pages 250-255, 1996.
[39] J. Han and M. Kamber, "Data Mining: Concepts and Techniques," Morgan Kaufmann, 2001.
[40] J. Han, J. Pei, and Y. Yin, "Mining Frequent Patterns without Candidate Generation," Proceedings of the ACM SIGMOD International Conference, pages 486-493, 2000.
[41] C. C. Hayes, "Agents in a Nutshell: A Very Brief Introduction," IEEE Transactions on Knowledge and Data Engineering, 11(1):127-132, Jan/Feb 1999.
[42] C. N. Hsu and M. T. Dung, "Generating Finite-state Transducers for Semi-structured Data Extraction from the Web," Information Systems, 23(8):521-538, 1998.
[43] A. Jain, M. Murty, and P. Flynn, "Data Clustering: A Review," ACM Computing Surveys, 31(3):264-323, 1999.
[44] Y. Jing and W. B. Croft, "An Association Thesaurus for Information Retrieval," UMass Technical Report 94-17, http://cobar.cs.umass.edu/info/psfiles/irpubs/jingcroftassocthes.ps.gz.
[45] T. Kalt and W. B. Croft, "A New Probabilistic Model of Text Classification and Retrieval," UMass Computer Science Technical Report IR-78, 1996, http://cobar.cs.umass.edu/info/psfiles/irpubs/ir.html.
[46] M. Kantardzic, "Data Mining: Concepts, Models, Methods, and Algorithms," Wiley-Interscience, 2003.
[47] H. Y. Kao, S. H. Lin, J. M. Ho, and M. S. Chen, "Entropy-Based Link Analysis for Mining Web Informative Structures," Proceedings of the Eleventh International Conference on Information and Knowledge Management (CIKM'02), 2002.
[48] H. Y. Kao, S. H. Lin, J. M. Ho, and M. S. Chen, "Mining Web Informative Structures and Contents Based on Entropy Analysis," to appear in IEEE Transactions on Knowledge and Data Engineering.
[49] G. Karypis, E.-H. Han, and V. Kumar, "CHAMELEON: A Hierarchical Clustering Algorithm Using Dynamic Modeling," IEEE Computer, 32(8):68-75, 1999.
[50] J. M. Kleinberg, "Authoritative Sources in a Hyperlinked Environment," ACM-SIAM Symposium on Discrete Algorithms, 1998.
[51] R. Kosala and H. Blockeel, "Web Mining Research: A Survey," SIGKDD Explorations, 2(1):1-15, 2000.
[52] N. Kushmerick, D. Weld, and R. Doorenbos, "Wrapper Induction for Information Extraction," Proceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI), 1997.
[53] L. S. Larkey and W. B. Croft, "Combining Classifiers in Text Categorization," Proceedings of the ACM SIGIR International Conference on Information Retrieval, pages 289-297, 1996.
[54] D. Lewis, "An Evaluation of Phrasal and Clustered Representations on a Text Categorization Task," Proceedings of the ACM SIGIR International Conference on Information Retrieval, pages 37-50, 1992.
[55] R. Lempel and S. Moran, "The Stochastic Approach for Link-Structure Analysis (SALSA) and the TKC Effect," Proceedings of the 9th International World Wide Web Conference, May 2000.
[56] S. H. Lin, M. C. Chen, J. M. Ho, and Y. M. Huang, "ACIRD: Intelligent Internet Document Organization and Retrieval," IEEE Transactions on Knowledge and Data Engineering, 14(3):599-614, May/June 2002.
[57] S. H. Lin and J. M. Ho, "Discovering Informative Content Blocks from Web Documents," Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002.
[58] S. H. Lin, C. S. Shih, M. C. Chen, J. M. Ho, M. T. Kao, and Y. M. Huang, "Extracting Classification Knowledge of Internet Documents: A Semantics Approach," Proceedings of the ACM SIGIR International Conference on Information Retrieval, pages 241-249, 1998.
[59] U. Manber and S. Wu, "GLIMPSE: A Tool to Search through Entire File Systems," Winter USENIX Technical Conference, pages 23-32, USENIX Association, 1994.
[60] S. Madria, S. Bhowmick, W. Ng, and P. Lim, "Research Issues in Web Data Mining," Proceedings of the International Conference on Data Warehousing and Knowledge Discovery, pages 303-312, 1999.
[61] A. McCallum, K. Nigam, J. Rennie, and K. Seymore, "A Machine Learning Approach to Building Domain-Specific Search Engines," Proceedings of the 16th International Joint Conference on Artificial Intelligence, pages 662-667, 1999.
[62] M. Mehta, J. Rissanen, and R. Agrawal, "SLIQ: A Fast Scalable Classifier for Data Mining," Proceedings of the 5th International Conference on Extending Database Technology, 1996.
[63] T. M. Mitchell, "Machine Learning," McGraw-Hill, 1997.
[64] J. Mostafa, S. Mukhopadhyay, W. Lam, and M. Palakal, "A Multilevel Approach to Intelligent Information Filtering: Model, System, and Evaluation," ACM Transactions on Information Systems, 15(4):368-399, October 1997.
[65] NCBI BLAST, http://www.ncbi.nlm.nih.gov/BLAST/.
[66] R. Ng and J. Han, "Efficient and Effective Clustering Methods for Spatial Data Mining," Proceedings of the 20th International Conference on Very Large Databases, 1994.
[67] R. Ng and J. Han, "CLARANS: A Method for Clustering Objects for Spatial Data Mining," IEEE Transactions on Knowledge and Data Engineering, 14(5):1003-1016, September/October 2002.
[68] S. K. Pal, V. Talwar, and P. Mitra, "Web Mining in Soft Computing Framework: Relevance, State of the Art and Future Directions," IEEE Transactions on Neural Networks, 13(5):1163-1177, 2002.
[69] J. S. Park, M.-S. Chen, and P. S. Yu, "Using a Hash-Based Method with Transaction Trimming for Mining Association Rules," IEEE Transactions on Knowledge and Data Engineering, 9(5):813-825, September/October 1997.
[70] G. Piatetsky-Shapiro and W. J. Frawley, "Knowledge Discovery in Databases," AAAI/MIT Press, 1991.
[71] M. F. Porter, "An Algorithm for Suffix Stripping," Program, 14(3):130-137, 1980.
[72] J. R. Quinlan, "Induction of Decision Trees," Machine Learning, 1:81-106, 1986.
[73] J. R. Quinlan, "C4.5: Programs for Machine Learning," Morgan Kaufmann Publishers, San Mateo, CA, 1993.
[74] S. Raghavan and H. Garcia-Molina, "Crawling the Hidden Web," Proceedings of the 27th International Conference on Very Large Data Bases, pages 129-138, 2001.
[75] J. Rennie and A. McCallum, "Using Reinforcement Learning to Spider the Web Efficiently," Proceedings of the 16th International Conference on Machine Learning, pages 335-343, 1999.
[76] G. Salton, "Automatic Information Organization and Retrieval," McGraw-Hill, 1968.
[77] G. Salton and C. Buckley, "Term-weighting Approaches in Automatic Text Retrieval," Information Processing and Management, 24(5):513-523, 1988.
[78] G. Salton and C. Buckley, "Improving Retrieval Performance by Relevance Feedback," Journal of the American Society for Information Science, 41(4):288-297, 1990.
[79] G. Salton, A. Wong, and C. Yang, "A Vector Space Model for Automatic Indexing," Communications of the ACM, 18(11):613-620, 1975.
[80] D. Shasha and T. Wang, "New Techniques for Best-Match Retrieval," ACM Transactions on Office Information Systems, 8(2):140-158, 1990.
[81] R. Srikant and R. Agrawal, "Mining Generalized Association Rules," Proceedings of the 21st International Conference on Very Large Databases, pages 407-419, 1995.
[82] R. Srikant and R. Agrawal, "Mining Quantitative Association Rules in Large Relational Tables," Proceedings of the ACM SIGMOD International Conference on Management of Data, June 1996.
[83] S. B. Thrun et al., "The MONK's Problems: A Performance Comparison of Different Learning Algorithms," Technical Report CMU-CS-91-197, Carnegie Mellon University, 1991.
[84] W3C, "World Wide Web Consortium," http://www.w3.org/.
[85] W3C DOM, "Document Object Model (DOM)," http://www.w3.org/DOM/.
[86] W3C HTML, "HTML 4.01 Specification," http://www.w3.org/TR/html4/.
[87] W3C HTTP, "HTTP: Hypertext Transfer Protocol," http://www.w3.org/Protocols/.
[88] W3C Semantic Web, "Semantic Web," http://www.w3.org/2001/sw/.
[89] W3C RDF, "Resource Description Framework," http://www.w3.org/RDF/.
[90] W3C WebOnt, "Web-Ontology (WebOnt) Working Group," http://www.w3.org/2001/sw/WebOnt/.
[91] W3C XML, "Extensible Markup Language (XML)," http://www.w3.org/XML/.
[92] K. Wang and H. Liu, "Discovering Structural Association of Semistructured Data," IEEE Transactions on Knowledge and Data Engineering, 12(3):353-371, 2000.
[93] M. Wooldridge and N. Jennings, "Intelligent Agents: Theory and Practice," Knowledge Engineering Review, 10(2):115-152, Cambridge University Press, 1995.
[94] Y. Yang, "Expert Network: Effective and Efficient Learning from Human Decisions in Text Categorization and Retrieval," Proceedings of the ACM SIGIR International Conference on Information Retrieval, pages 13-22, 1994.
[95] B. Yuwono, S. L. Y. Lam, J. H. Ying, and D. L. Lee, "A World Wide Web Resource Discovery System," World Wide Web Journal, 1(1), Winter 1996.
[96] O. R. Zaiane, M. Xin, and J. Han, "Discovering Web Access Patterns and Trends by Applying OLAP and Data Mining Technology on Web Logs," Proceedings of the Advances in Digital Libraries Conference, pages 19-29, 1998.
[97] T. Zhang, R. Ramakrishnan, and M. Livny, "BIRCH: An Efficient Data Clustering Method for Very Large Databases," Proceedings of the ACM SIGMOD Conference on Management of Data, pages 103-114, 1996.
[98] G. K. Zipf, "Human Behavior and the Principle of Least Effort," Addison Wesley Publishing, Reading, Massachusetts, 1949.
AGGREGATOR: A KNOWLEDGE BASED COMPARISON CHART BUILDER FOR eSHOPPING
F. KOKKORAS, N. BASSILIADES, AND I. VLAHAVAS
1. INTRODUCTION
Most internet stores selling certain types of products usually offer a limited set of brand names and, for each brand name, a limited set of products. In addition, the design of such e-commerce sites is strongly influenced by retailers whose only goal is to sell as many products as possible to the users that visit their site. As a result, such sites follow a fixed representation for the products offered, put more emphasis on the price than on a complete presentation of the features of the product, and, unfortunately, discourage side-by-side comparison shopping. Moreover, when presenting various products, they emphasize just a few strong features and do not mention the weak ones. Although such e-shops are valuable for the final purchase transaction, they fail to serve the non-informed customer, that is, the potential buyer who has no clear picture of what exactly to buy from the available alternatives. Such an information need on the customer's side can usually be covered by browsing the product brand's site, where detailed specification pages about the products can be found. The negative aspect of this approach is the huge amount of time required by the buyer to create a clear picture of the advantages and disadvantages of the available products. Considering that there are many brands making the desired product and that each of them offers many models, browsing so many specification pages is a time-consuming task. To make things worse, comparing the different models can be done only manually, on paper or by copying and pasting information
[Figure 1 shows three example wrappers and the text each extracts: (a) a regular expression identifying font HTML tags in an HTML source; (b) a linear wrapper extracting a digital camera model name from an HTML snippet; and (c) a hybrid wrapper, a tree path expression *.table.*.td(X, "€\d") combined with the regular expression "€\d", extracting prices in euros from HTML table cells.]
Figure 1. Typical expressions of wrappers of various technologies and their extracted result (framed text).
to another application, such as a spreadsheet. Even for the experienced web user this workload discourages such a task.
The discussion above makes clear that there is a need for software tools that allow the creation of comparison shopping charts, as effortlessly as possible, by gathering product specification information from various known sites. This is not an information retrieval task but rather an information extraction one. A web search engine can probably help to locate an information resource, but it is unable to process that resource, extract feature-value pairs, and integrate that information into a single comparison table.
In recent years, various researchers have proposed methods and developed tools for the web information extraction task, with the buzzword of the field being the term wrapper. A wrapper (or extraction rule) is a mapping that populates a data repository with implicit objects that exist inside a given web page. Creating a wrapper usually involves some training (wrapper induction [31]) by which the wrapper learns to identify the desired information. Unlike Natural Language Processing (NLP) techniques that rely on specific domain knowledge and make use of semantic and syntactic constraints, wrapper induction mainly focuses on the features that surround the desired information (delimiters). These features are usually the HTML tags that tell a web browser how to render the page. In addition, the extraction of typed information like addresses, telephone numbers, prices, etc., is usually performed through extensive usage of regular expressions (Figure 1). Regular expressions are textual patterns that abstractly, but precisely, describe some content. For example, a regular expression describing a price in euros could be something like "€\d".
Besides regular expressions, there are two major research directions in wrapper induction. The first and older one treats the HTML page as a linear sequence of HTML tags and textual content ([2], [26], [35], [37]). Under this perspective, wrapper generation is a kind of substring detection problem. Such a wrapper usually includes delimiters in the form of substrings that prefix and suffix the desired information. These delimiters can be either pointed out to the wrapper generation program by the user (supervised learning) or located automatically (unsupervised learning). The former method usually requires fewer training examples but should be guided by a user with a good understanding of HTML. The latter approach usually requires more training examples but can be fully automated.
As Internet technologies emerged, a new breed of wrapper induction techniques appeared ([8], [12], [30]) that treat the HTML document as a tree structure, according to the Document Object Model (DOM) [18]. Basically, such a tree wrapper uses path expressions to refer to page elements that contain the desired information (Figure 1). Tree wrappers seem to be more powerful than string wrappers. Actually, if input documents are well structured and tags at the lowest level do not contain several types of data, then a string wrapper can always be expressed as a tree wrapper [36]. Thanks to the advanced tools that are available for web page design, HTML pages are nowadays highly well-formed, but at the same time the content is more decorated with HTML tags and attributes. As a result, although the approximate location of the desired information is relatively easy with tree wrappers, extraction of the exact piece of information requires regular expressions or even NLP (Figure 1). Thus, hybrid approaches are becoming quite popular (a minimal sketch of such a hybrid wrapper is given below).
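As an illustration of the hybrid idea of Figure 1, the following Python sketch (our own, using only the standard library; the class name and the price pattern are assumptions) combines a tree constraint (the text must sit inside a td cell of a table) with a regular expression that pins down the exact euro price:

import re
from html.parser import HTMLParser

class HybridTdWrapper(HTMLParser):
    """Hybrid wrapper sketch: a tree constraint (*.table.*.td) combined
    with a regular expression extracting a price in euros."""
    def __init__(self, pattern=r"€\s*\d+(?:[.,]\d+)?"):
        super().__init__()
        self.path = []                 # stack of open tags = tree path
        self.pattern = re.compile(pattern)
        self.matches = []

    def handle_starttag(self, tag, attrs):
        self.path.append(tag)

    def handle_endtag(self, tag):
        # pop until the matching open tag is removed
        while self.path and self.path.pop() != tag:
            pass

    def handle_data(self, data):
        # tree wrapper part: we must be inside a td, somewhere under a table
        if "table" in self.path and self.path and self.path[-1] == "td":
            # regex part: keep only the price, not the whole cell text
            self.matches += self.pattern.findall(data)

wrapper = HybridTdWrapper()
wrapper.feed("<table><tr><td>Canon S300</td><td>€ 499</td></tr></table>")
print(wrapper.matches)   # ['€ 499']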
In general, wrapper induction technology demonstrates that shallow pattern matching techniques, which are based on document structural information rather than linguistic knowledge, can be very effective. Until the semantic web [7] becomes commonplace, information extraction techniques will continue to play an important role towards the informed customer concept.
In the comparison chart building problem, extracting and integrating information from heterogeneous web sources requires more than one wrapper. Variety in the way information is encoded and presented requires the cooperation of individual information extraction agents that are specialized for certain pieces of information and web sources. Creating, coordinating and maintaining a large number of wrappers is not a simple task, though. A crucial factor that can alleviate this burden is the way wrappers are encoded and trained. Having to modify an ill-described wrapper that has ceased to work for some reason is much more difficult than modifying a wrapper described in a human-friendly way. This need is becoming critical as more non-expert users adopt information extraction technologies for personalization and information filtering. Visual tools that allow the easy creation of wrappers ([1], [4], [20], [27], [32]) and declarative languages for wrapper encoding ([4], [29], [32]) are the current established trend.
In this chapter, we present a knowledge based approach to comparison chart building from heterogeneous, semi-structured sources (product specification web pages). We propose the usage of the Conceptual Graphs (CGs) knowledge representation and reasoning formalism to train and describe information extraction wrappers. CGs naturally support the wrapper induction problem as a series of conceptual graph (CG) generalization and specialization operations between training examples expressed as CGs. On the other hand, wrapper evaluation corresponds to the CG projection operation. Additionally, using DOM and product related domain knowledge, as well as advanced visual tools, we turn the wrapper creation and testing problem into an effortless task. Finally, we present the Aggregator, a comparison chart builder program that is based on the proposed approach. The Aggregator can be taught how to gather specification information from web pages offered by brand sites, and then use this knowledge to create side-by-side feature comparison charts by mining web pages in a highly automated and accurate fashion.
The rest of the chapter is organized as follows: Section 2 presents related work in the field of wrapper induction and information extraction, emphasizing comparison shopping and visual approaches. Section 3 gives a short introduction to CGs and proposes a novel approach for wrapper training, modeling and evaluation that is based on CGs. Section 4 presents how our CG-based wrappers and domain knowledge can be used to create comparison charts from heterogeneous web sources. Section 5 outlines the Aggregator, a tool that allows the user to visually train and apply CG-based wrappers. Finally, Section 6 concludes the chapter and gives insight into future work.
2. RELATED WORK
In the last few years, many approaches and related tools have been proposed to address the web information extraction problem. In the following, we give some detail about approaches that are closer to ours, in the sense that they either exploit a tree representation of a web page ([4], [29], [32]) or use target structures that describe objects of interest and try to locate portions of web pages that implicitly conform to those structures ([1], [20], [27]). A good survey on information extraction from the web can be found in [28].
XWRAP [29] is an interactive system for semi-automatic generation of wrapper programs. Its core procedure is a three-step task in which the user first identifies interesting regions, then identifies token name and token value pairs, and finally identifies the useful hierarchical structures of the retrieved document. Each step results in a set of extraction rules specified in a declarative language. At the end, these rules are converted into a Java program which is a wrapper for a specific source. XWRAP features a component library that provides source-independent, basic building blocks for wrappers and provides heuristics to locate data objects of interest.
In W4F ([32], [33]), a toolkit for building wrappers, the user first uses one or more retrieval rules to describe how a web document is accessed. Then, he/she uses a DOM representation and a web page annotated with additional information to describe what pieces of data to extract. Finally, he/she declares what target structure to use for storing the extracted data. W4F offers a wizard to assist the user in writing extraction rules, which are described in HEL (HTML Extraction Language) and denote an assignment between a variable name and a path expression. The wizard cannot deal with collections of items, so if the user is interested in various items of the same type as the one clicked on, conditions must be attached to the path expression to write robust extraction rules.
Lixto ([3], [4]) is a system that assists the user to semi-automatically create wrapper programs by providing a visual and interactive user interface. It allows the extraction of target patterns based on surrounding landmarks, on the content itself, on HTML attributes, on the order of appearance and on semantic and syntactic concepts. In
addition, it allows disjunctive wrapper definitions, crawling to other pages during extraction, and recursive wrapping. Wrappers created with Lixto are encoded in Elog, a declarative extraction language which uses a datalog-like logical syntax and semantics. Lixto TS [5] is an extension to the basic system aiming at web aggregation applications through visual programming.
NoDoSE [1] provides a graphical user interface in which the user hierarchically decomposes the web document, outlining its interesting regions and describing their semantics. This decomposition occurs in levels; for each one of them the user builds an object with a complex structure and then decomposes it into other objects with a simpler structure. The system uses this object hierarchy to identify other similar objects in the document. This is accomplished by a mining component that attempts to infer the grammar of the document from the objects constructed by the user.
DEByE [27] is an interactive tool that allows the user to assemble nested tables (with possible variations in structure) using pieces of data taken from a sample page. The tables assembled are examples of the objects to be identified on similar target pages. DEByE generates object extraction patterns that indicate the structure and the textual surroundings of the objects to be extracted. These patterns are then fed to a bottom-up extraction algorithm that takes a target page as input, identifies atomic values in this page, and assembles complex objects using the structure of the pattern as a guide.
In [20], an ontology-based approach to information extraction is presented. The ontology (conceptual model), which is described in the Object-oriented Systems Model, is constructed prior to extraction and describes the data of interest, relationships, lexical appearance and context keywords. The extraction tool uses this ontology to determine what to extract from record-sized chunks that are derived from a web page and cleared of HTML tags. This use of ontological knowledge enables a wrapper to withstand small variations existing in similar web pages (improved resiliency) and to work better with documents presenting similar information but organized differently (improved adaptivity).
Our proposed framework for wrapper creation offers very similar functionality to all of the above approaches, in the sense that it provides a visual environment for wrapper creation. There exists a major difference, though, in the core technology used, which, for our tool, is the Conceptual Graph formalism. Our choice allows us to exploit both DOM representations of web documents (the approach used in [4], [29] and [32]), as well as user-defined structures that describe objects of interest (the approach used in [1] and [27]). We achieve this by using CG-based generic wrapper descriptions which are detailed by the user in an interactive way, using visual tools that combine not only the DOM representation, but the browser itself. The CG formalism naturally supports all the major steps in information extraction with wrappers, with its generalization, specialization and projection operations. In addition, CGs are a proven technology for encoding ontological knowledge to provide a common schema for information integration and for improving a wrapper's resiliency and adaptivity in the way [20] does. Beyond that, the representation we use provides the operations required to create a functional
[Wrapper] ← (targetURL) ← [URL]
Figure 2. A Conceptual Graph stating that there exists a wrapper aiming at some URL.
reasoning system. This allows the creation of dynamic ontologies, where static and axiomatic/rule knowledge co-exist [15]. For example, we can use such knowledge to create structural dependencies between two wrappers. Finally, the CG formalism has, by nature, better visualization potential. This enables our system to provide a more comprehensible wrapper representation to the end-user.
Regarding comparison shopping, one of the earliest attempts is ShopBot [19]. It focuses on vendor sites with form-based search pages, returning lists of products in a tabular format. By today's standards, ShopBot is quite restricted, since it uses linear wrappers and focuses on highly structured pages. A commercial version of ShopBot, known as Jango, was bought by Excite. Apart from Lixto TS [5], there are many other commercial wrapping services available on the Internet, such as Junglee (bought by Amazon), Jango, mySimon, RoboShopper and PriceGrabber. Jango and mySimon use real-time information gathering from merchant sites, while Junglee pre-fetches information into a local database and updates it when necessary. All sites provide comparative shopping based on integrated information delivered from other vendor sites. Besides their unknown technology, which is considered a business asset, most of these sites put emphasis on the price and provide very limited product specification information. Only PriceGrabber offers side-by-side comparison charts rich in specification information.
3. WRAPPERS AS CONCEPTUAL GRAPHS
In this section we first give a short introduction to CGs, focusing mainly on the generalization, specialization and projection operations, which are the key ideas behind our proposed CG-Wrap model. Then we present how CGs can be used to model information extraction wrappers.
3.1. Conceptual graphs background
The elements of CG theory ([14], [34]) are concept-types, concepts, relation-types and relations. Concept-types represent classes of entities, attributes, states and events. Concept-types can be merged in a lattice whose partial ordering relation < can be interpreted as a categorical generalization relation. A concept is an instantiation of a concept-type and is usually denoted by a concept-type label inside a box or between "[" and "]" (Figure 2). To refer to specific individuals, a referent field is added to the concept ([table:*], a table; [table:{*}@3], three tables; etc.). Relations are instantiations of relation-types and show the relations between concepts. They are usually denoted as a relation label inside a circle or between parentheses (Figure 2). A relation type determines the number of arcs allowed on the relation, as well as the types of the concepts (or their subtypes) linked to these arcs.
Figure 3. CG3 is the minimum common generalization of CG1 and CG2.
A Conceptual Graph is a finite, connected, bipartite graph consisting of concept and relation nodes (Figure 2). Each relation is linked only to its requisite number of concepts, and each concept to zero or more relations. CGs represent information about typical objects or classes of objects in the world and can be used to define new concepts in terms of old ones. The type hierarchy established for both concepts and relations is based on the intuition that some types subsume other types; for example, every instance of the concept Table would also have all the properties of HTMLElement. In addition, with a number of defined operations on CGs (canonical formation rules) one can derive allowable CGs from other CGs. These rules enforce constraints on meaningfulness; they do not allow nonsensical graphs to be created from meaningful ones.
Among the operations defined over CGs, the most useful and the most relevant to the information extraction problem are the generalization, specialization and projection operations. Generalization is an operation that monotonically increases the set of models for which some CG is true. For example, CG3 in Figure 3 is the minimum common generalization of CG1 and CG2. Only common parts (concepts and relations) of CG1 and CG2 are kept in CG3. In addition, individual concepts like [BGColor: "#FFFFFF"] have become generic by removing the referent field. Specialization is the opposite of the generalization operation. It monotonically decreases the set of models for which some CG is true. This is achieved either by adding more parts (concepts and/or relations) to a CG, or by assigning an individual referent to some generic concept. Projection is a complex operation that projects a CG v over another CG u which is a specialization of v (u ≤ v), that is, there is a subgraph u' embedded in u that represents the original v. The result is one or more CGs πv which are similar to v, but some of their concepts may have been specialized, by either specializing the concept type, or assigning a value to some generic referent, or both.
From the machine learning perspective, training information extraction wrappers is a combination of automatic generalization and manual specialization operations that results in a model (pattern) that best describes the training instances and that can be used to detect new, unknown instances. This is similar to the generalization and specialization operations of CG theory. A CG wrapper is the result of generalization and specialization operations over two or more training instances expressed as CGs. Moreover, applying a CG wrapper is equivalent to a projection operation of the wrapper over web page elements expressed as CGs. Based on these analogies, we present next how CGs can be used to model and train information extraction wrappers.
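The following Python sketch (our own simplification; CGs are reduced here to star-shaped graphs around a single concept, and the type lattice is a toy) illustrates how projection checks a pattern graph against a target and binds generic referents:

# Toy type lattice: child type -> parent type.
SUBTYPE = {"Table": "HTMLElement", "HTMLElement": "Entity",
           "Text": "Entity", "HTMLTag": "Entity", "Integer": "Entity"}

def is_subtype(t, super_t):
    while t is not None:
        if t == super_t:
            return True
        t = SUBTYPE.get(t)
    return False

def project(pattern, target):
    """Project a star-shaped CG `pattern` over `target`.  Each graph maps
    a relation name to (concept type, referent-or-None); a None referent
    is a generic concept.  Returns the bindings of the pattern's generic
    referents, or None if the projection fails."""
    bindings = {}
    for rel, (p_type, p_ref) in pattern.items():
        if rel not in target:
            return None
        t_type, t_ref = target[rel]
        if not is_subtype(t_type, p_type):   # target must specialize pattern
            return None
        if p_ref is None:
            bindings[rel] = t_ref            # generic -> individual referent
        elif p_ref != t_ref:
            return None
    return bindings

# A pattern with a generic innerText projects over a concrete table cell:
pattern = {"tag": ("HTMLTag", "TD"), "innerText": ("Text", None)}
node    = {"tag": ("HTMLTag", "TD"), "innerText": ("Text", "Canon S300"),
           "siblingOrder": ("Integer", 1)}
print(project(pattern, node))   # {'innerText': 'Canon S300'}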
[Wrapper] -
  ← (targetURL) ← [URL]
  ← (output) ← [Info]
  ← (container) ← [HTMLElement]
Figure 4. An abstract wrapper as a conceptual graph.
[HTMLElement] -
  ← (parent) ← [HTMLElement]
  ← (innerText) ← [Text]
  ← (tag) ← [HTMLTag]
  ← (siblingCount) ← [Integer]
  ← (siblingOrder) ← [Integer]
  ← (attribute) ← [Attribute]
Figure 5. An HTML element in CG form (simplified and reduced version).
3.2. Modeling and training wrappers with CGs
The ability of CGs to represent entities of arbitrary complexity in a comprehensible way makes them a promising candidate for modeling information extraction wrappers. This perception is strengthened by the highly structured document representation defined by the DOM specification. This tree structure allows the easy mapping of web document elements to CG components.
In general, a wrapper accesses a page located at a specific URL, searches inside this page for some specific HTML element which is the container of the desired information, and extracts that information from it. This abstract description is encoded as the CG depicted in Figure 4. In practice, such a generic wrapper is useless, in the sense that it describes every single element of an HTML page. More specialization is required, particularly in the HTMLElement concept. Towards this, we exploit the highly structured and information-rich HTML element description provided by modern browsers. Such information includes, among others, the text contained inside the element, its attributes, its parent element under the DOM perspective, its tag name, etc. Besides this information, which is directly accessible, we also exploit calculated information that is derived by considering the neighborhood of an element. Such information includes, for example, the sibling order of the element among the children of its parent element, and the total number of siblings. With this information in hand, a complex HTML element description can be created in CG form. Such a CG is presented in Figure 5. Note that, for clarity, Figure 5 presents a simplification (a CG operation) of six CGs over the common [HTMLElement] concept presented on the left. Moreover, for space economy, a reduced version is presented, since the actual description is quite more complex.
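For illustration, the following Python sketch (our own; it uses the standard library's minidom rather than a browser API, and the sample markup is hypothetical) computes such an element description, including the calculated siblingOrder and siblingCount features, for a table cell:

from xml.dom.minidom import parseString

def describe(node):
    """Build the attribute part of an HTML element's CG description
    (cf. Figure 5): tag, inner text, attributes, and the calculated
    sibling order / sibling count within its parent."""
    siblings = [n for n in node.parentNode.childNodes
                if n.nodeType == n.ELEMENT_NODE]
    return {
        "tag": node.tagName.upper(),
        "innerText": "".join(t.data for t in node.childNodes
                             if t.nodeType == t.TEXT_NODE).strip(),
        "attribute": dict(node.attributes.items()),
        "siblingCount": len(siblings),
        "siblingOrder": siblings.index(node) + 1,
    }

doc = parseString("<tr><td>ad</td><td>MODEM MOTOROLA SM56E</td></tr>")
cells = doc.getElementsByTagName("td")
print(describe(cells[1]))
# {'tag': 'TD', 'innerText': 'MODEM MOTOROLA SM56E',
#  'attribute': {}, 'siblingCount': 2, 'siblingOrder': 2}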
We demonstrate how our generic wrapper can be specialized using the classical problem of extracting information from an electronic flea market. Figure 6 presents a snippet from a web page of such a site.
[Figure 6: a flea-market web page listing modem products (e.g., "MODEM MOTOROLA SM56E...", "DIAMOND SUPRA...") in an HTML table.]
Information is organized in an HTML table, where the first row holds the headers and the rest of the rows correspond to records describing the offered products. We assume that we want to extract the names of the products offered. In a real situation, where the user is not expected to be an HTML expert, the wrapper creation program should allow the identification of instances of the desired information by simply pointing at them with the mouse (we have developed such a tool, which is presented in a following section). Let's say the user points to the table cell containing the name of the first product. This specializes the generic wrapper description, which takes the form presented in Figure 7. Unfortunately, this specialized version is not general enough, since it is able to extract only the training instance. A second training instance should be used, say the cell containing the name of the second product. This results in the wrapper instance presented in Figure 8.
Figure 10. The final CG wrapper modeling the product names of the table in Figure 6.
Using the generalization operation of CG theory on the two CG wrapper instances, a generic wrapper describing (extracting) both product names can be created (Figure 9). This wrapper is generic enough to extract all product names of the table in Figure 6, but it also extracts the first header cell. Further specialization of our CG wrapper is required to exclude the header cell. This can be established over the HTML element that is the parent of the element containing the extracted information. This element refers to a row of the product table. Excluding the header row is as simple as requesting that this element's sibling order be greater than one. The final wrapper is presented in Figure 10. Note that the concept of the CG wrapper that contains the desired information ([Info]) is fed by the [Text: ?X] concept, since this part of the web page contains the desired information. In addition, parts of the final wrapper description that do not affect the accuracy of the wrapper, such as [BGColor], can be dropped. Finally, regular expressions can be used over the initially extracted information in order to fine-tune the output. For example, extracting the price in euros from the flea market example requires the replacement of ?X with some proper regular expression that is applied over X. Thus, training a CG-Wrapper is a set of automatic generalization and manual specialization tasks that results in a model (CG) that accurately describes the desired information inside a web page.
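The following Python sketch (our own flat approximation of the CG operations; attribute names follow Figure 5, and the sample values are hypothetical) shows the training flow: an automatic minimum common generalization of two pointed-at cells, followed by the manual sibling-order constraint that excludes the header row:

def generalize(d1, d2):
    """Minimum common generalization of two training instances: keep the
    common attributes; where referents differ, keep the attribute but make
    its concept generic (None), as in Figure 9."""
    return {key: (d1[key] if d1[key] == d2[key] else None)
            for key in d1.keys() & d2.keys()}

# Two training cells pointed at by the user (cf. Figures 7 and 8):
cell1 = {"tag": "TD", "siblingOrder": 1, "siblingCount": 5,
         "innerText": "MODEM MOTOROLA SM56E..."}
cell2 = {"tag": "TD", "siblingOrder": 1, "siblingCount": 5,
         "innerText": "DIAMOND SUPRA v92 inte..."}

wrapper = generalize(cell1, cell2)
# {'tag': 'TD', 'siblingOrder': 1, 'siblingCount': 5, 'innerText': None}

# Manual specialization to exclude the header row (Figure 10):
wrapper["parent.siblingOrder"] = (">", 1)

def matches(wrapper, desc):
    """Check whether a node description satisfies the trained wrapper."""
    for key, want in wrapper.items():
        if want is None:
            continue                        # generic concept: always matches
        have = desc.get(key)
        if isinstance(want, tuple):         # a constraint such as (">", 1)
            if not (want[0] == ">" and have is not None and have > want[1]):
                return False
        elif have != want:
            return False
    return True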
Figure 11. Two nodes of an HTML tree, in CG form (partially presented).
[Wrapper: fleaName] -
  ← (targetURL) ← [URL: www.fleamarket.gr]
  ← (output) ← [Info: "DIAMOND SUPRA v92 inte..."]
  ← (container) ← [HTMLElement] -
      ← (parent) ← [HTMLElement] ← (siblingOrder) ← [Integer: >1]
      ← (innerText) ← [Text: "DIAMOND SUPRA v92 inte..."]
      ← (tag) ← [HTMLTag: "TD"]
      ← (siblingCount) ← [Integer: 5]
      ← (siblingOrder) ← [Integer: 1]
Figure 12. The wrapper of Figure 10 after applying it over the CGs of Figure 11.
We propose two execution models for our CG-Wrappers, a naive and an optimized one. According to the naive execution model, we iterate over all the nodes of the HTML tree, trying to satisfy the constraints imposed by the wrapper components. In the optimized execution model we first do some sort of filtering, to exclude nodes that are definitely irrelevant. For example, the wrapper of Figure 10 need only be evaluated over the nodes that have a TD tag. Selecting only those nodes is possible by exploiting the browser's application programming interface (API). The semantics of both execution models are derived from CG theory: the evaluation of a CG-Wrapper is the result πv of a projection operation that projects the container part u of the wrapper over an HTML node v expressed as a CG. For example, consider the two CGs of Figure 11, which refer to the table of Figure 6, representing the second product row and the first cell of this row, respectively. Projecting the container part of the CG wrapper of Figure 10 over the second CG of Figure 11 results in an instantiated CG wrapper where the unbound X referent of the [Text: ?X] concept has been unified with "DIAMOND SUPRA v92 inte ...". Note that the exact projection also involves a replacement of the concept [HTMLElement: #25] of the second CG with the CG definition of this concept (that is, the first CG in Figure 11). This inner task corresponds to the expansion operation of CG theory, where a concept is replaced by its CG definition. The final instantiated wrapper is displayed in Figure 12.
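A minimal sketch of the two execution models (our own; it assumes node objects with a children attribute, a match predicate such as the one sketched earlier, and a pre-built tag index standing in for the browser API):

def all_nodes(root):
    """Depth-first traversal of an HTML tree."""
    yield root
    for child in getattr(root, "children", []):
        yield from all_nodes(child)

def evaluate_naive(wrapper, root, match):
    """Naive model: try to project the wrapper over every node."""
    return [n for n in all_nodes(root) if match(wrapper, n)]

def evaluate_optimized(wrapper, tag_index, match):
    """Optimized model: pre-select only the nodes that can possibly
    satisfy the wrapper, e.g. all TD nodes for the wrapper of Figure 10,
    then project over those candidates only."""
    candidates = tag_index.get(wrapper.get("tag"), [])
    return [n for n in candidates if match(wrapper, n)]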
Figure 13. A looping wrapper in CG notation.
LoopingWrapperExecutor(LoopingWrapper: #3)
begin
  Results := ∅;
  repeat
    WrapperExecutor(Wrapper: #1, subResults);       { run the data collector }
    Results := AggregateResults(Results, subResults);
    WrapperExecutor(Wrapper: #2, nextURL);          { run the loop definer }
    UpdateWrapper(Wrapper: #1, URL, nextURL);       { retarget the collector }
  until nextURL = null;
end;
Figure 14. Abstract execution model of a looping wrapper.
3.3. Reusing CG-wrappers
The CG-Wrap model is expressive enough to handle nested wrapper definitions, that is, wrappers that are defined in terms of other wrappers. One very useful case is the definition of a looping wrapper that collects results from chained pages containing search results. Consider, for example, the typical case in which an on-line store presents the results of some user query in individual pages containing 10 items each. In such cases, at the bottom of all pages but the last one, there is a link to the next result page, usually named "Next Page". A looping wrapper is capable of extracting information from all result pages by automatically following the "Next Page" link. Thus, a looping CG-Wrapper (Figure 13) is a combination of a data collector wrapper and a loop definer wrapper. A data collector (Wrapper: #1) is a typical CG-Wrapper that extracts information from a web page. A loop definer (Wrapper: #2) is a CG-Wrapper that extracts the URL of the next page, in the case of information that is presented in a sequence of pages. These two wrappers have a common target URL. The evaluation of a looping wrapper is presented in Figure 14. First, the data collector is executed and the extracted information is appended to the already extracted results.
Brand's Main Page → Product List Page → Specific Product Page (with specs)
Brand's Main Page → Product List Page → Specific Product Page → Product's Specification Page
Figure 15. Typical location of a product's specification information in a brand site.
Then the loop definer wrapper is executed, extracting from the same page the URL of the "Next Page" link. If this second wrapper brings results, the target URL of the data collector is updated. These steps are repeated until the loop definer fails to extract information.
4. COMPARISON CHART BUILDING WITH CG-WRAPPERS
In this section, we identify the problems involved in the comparison chart building task and propose visual and ontology-driven approaches that can provide substantial automation to the whole task.
4.1. Locating product specification pages
Building a comparison chart for a certain type of products using information presented in web pages requires, first of all, locating those pages. Without doubt, the web sites of the various brands are the best place to visit. Locating such sites on the web is a relatively simple task. All that someone has to do is to either try some "URL guessing" heuristics using the www.<brand>.com pattern for known brands, or use a search engine (or a portal) to locate an e-shop selling the desired category of products, where all major brands are usually mentioned. Having a brand's URL makes the detection of product specification pages a task of a couple of clicks. From a brand's main page someone has to follow the "Products" link to go to a page where a complete list of links to the various products is available. It is remarkable how strong the above heuristic is. The detailed specifications of a particular product are usually displayed either inside the product's page or in a separate, dedicated page accessible from the product's main page. These organizations are depicted in Figure 15.
It is clear that, even considering that the URLs of brand sites are known, some automation is required towards collecting all the URLs of product specification pages. We have developed a URL wizard that allows the average user to visually manipulate a web page and collect information presented in it. For the purpose of collecting the URLs where products are presented, the user can exploit the product list page of a brand site, where links to all available products are provided. He/she simply points at (or selects) the anchor object(s) inside such a page and asks for URL harvesting from a context menu. For better manipulation, our tool provides a tree view of the web page as well. This tree view is synchronized with the browser window (see Figure 19 in Section 5), that is, when the user points over a page element in the browser window, the corresponding branch in the tree representation is automatically highlighted, and vice-versa.
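A URL-harvesting step of this kind can be sketched as follows (our own minimal version using the Python standard library; the brand URL and the markup are hypothetical):

from html.parser import HTMLParser
from urllib.parse import urljoin

class URLHarvester(HTMLParser):
    """Collect the hrefs of anchors in a brand's product-list page,
    resolving relative links against the page's own URL."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.urls.append(urljoin(self.base_url, href))

h = URLHarvester("http://www.brand.com/products/")
h.feed('<a href="cam100.html">CAM 100</a><a href="cam200.html">CAM 200</a>')
print(h.urls)
# ['http://www.brand.com/products/cam100.html',
#  'http://www.brand.com/products/cam200.html']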
The above approach works perfectly for sites following the organization presented in Figure 15 (left). When the product has a dedicated specification page, we train a CG wrapper that learns how to find the anchor to the specification page inside the product's main page.
4.2. Collecting and merging specification information
The main difficulty in comparison chart building stems mainly from the fact that product features are not presented in a uniform way inside specification pages. Figure 16 demonstrates how diverse two specification pages can be, although they both refer to similar products (digital cameras). Not only is the layout of the pages different, which renders most of the HTML-tag-based information extraction methods obsolete, but the exact vocabulary used across brands also varies. The latter makes regular-expression-based extraction troublesome as well. There exist, though, two strong "per brand" regularities that seem worth exploiting:
• information in specification pages is usually presented in feature-value pairs enclosed in adjacent HTML tags, and
• the vocabulary used by each brand to refer to product features is almost fixed.
These two regularities suggest that a dual approach is required: first locate the feature, then locate and extract the nearby value. Since this combination works at the brand level, the final obstacle is to integrate the "per brand" partial results under a common schema.
We have selected to use a product ontology as a common schema. As the semantic web evolves, ontologies describing products of any kind are expected to become available. Such ontologies can be used to map features expressed in a brand's vocabulary to ontology elements. CGs are a proper candidate for describing ontological information. They offer a unified and simple representation formalism that covers a wide range of other data and knowledge modelling formalisms and allow matching, transformation, unification and inference operators to process the knowledge that they describe [23]. Having already used CGs to model and train our CG-Wrappers, CG-based ontological knowledge can be easily incorporated and contribute towards knowledge-based wrappers. Consider, for example, two wrappers that extract the focal length and the optical zoom from specification pages of digital cameras. Background knowledge regarding the relation that exists between optical zoom and focal length can be used to modify the kind of information that the focal length wrapper is expected to locate, assuming that the optical zoom wrapper has already extracted information. In another case, having selected the brand of a processor should automatically prevent the extraction of information for certain incompatible motherboard models. Although ontological knowledge is expected to become available in RDF/RDFS, the semantic web's language, converting this encoding to CGs is not an issue ([17], [6]). Furthermore, CGs provide a "ready to use" framework for reasoning. This is not the case, at the moment, for RDF/RDFS.
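As a small illustration of such background knowledge (our own sketch; the tolerance and the dictionary keys are assumptions), the rule "optical zoom is the ratio of the longest to the shortest focal length of the lens" can validate, or constrain, what the two wrappers extract:

def expected_optical_zoom(min_focal_mm, max_focal_mm):
    """Optical zoom is the ratio of the longest to the shortest focal
    length, e.g. a 7.4-22.2 mm lens gives a 3x zoom."""
    return max_focal_mm / min_focal_mm

def check_consistency(extracted, tolerance=0.1):
    """If both features were extracted, use the rule to validate them;
    if only the zoom is known, it constrains what the focal-length
    wrapper should expect to find."""
    zoom = extracted.get("optical_zoom")
    focal = extracted.get("focal_length")      # (min_mm, max_mm)
    if zoom is not None and focal is not None:
        return abs(expected_optical_zoom(*focal) - zoom) <= tolerance * zoom
    return True

print(check_consistency({"optical_zoom": 3.0, "focal_length": (7.4, 22.2)}))
# True  (22.2 / 7.4 = 3.0)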
Figure 16. Same type of products but diversity in the way specification information is presented.
Figure 17. A dual wrapper extracting digital zoom information.
As a result, we propose a dual wrapper approach for extracting feature-value pairs from product specification pages:

• Associate a wrapper with some product feature, as it is defined in the product ontology, and train the wrapper to locate that feature based on the term used by a brand.
• Use a second wrapper to extract the value of the feature.

This dual wrapper approach is justified by the fact that feature-value information is always located in adjacent HTML elements inside a web page. We can easily encode this information in our wrapper pair, thereby reducing the search space of the second wrapper. Furthermore, the second wrapper becomes capable of performing a "blind" extraction in case the value of some feature is presented in an unknown way. In a "blind" extraction the wrapper extracts all the text inside some HTML tag, because "it knows" that the information is there. This is obviously better than an exact-or-nothing approach. In addition, we do not depend on absolute positioning to refer to HTML nodes; instead we follow a "relative to textual information" methodology, which is more robust to small page changes. This is crucial, since many commercial sites tend to make frequent alterations to their pages to prevent wrapping. The same holds for advertisement banners and special offers, the frequent addition and removal of which render obsolete any wrappers that use absolute positioning.

Figure 17 displays a dual wrapper ([DualWrapper: CanonDigitalZoom]) extracting feature-value information (the digital zoom of a digital camera model). It is defined in terms of a feature locator wrapper ([Wrapper: #1]) that locates the table cell ([HTMLTag: "TD"]) containing the text "Digital Zoom", and a value extractor wrapper ([Wrapper: #2]) that extracts the value of the feature. The second wrapper is modelled to search in the table cell that is right after the cell the first wrapper worked with. This correlation is established over the parameter ?X.
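A minimal sketch of the dual-wrapper idea follows. It is ours, not the system's CG-based implementation: where Aggregator walks the DOM tree produced by an HTML parser, the sketch flattens the table cells with a regular expression for brevity, and the function names are illustrative. Wrapper #1 locates the cell carrying the brand's feature term; wrapper #2 takes the adjacent cell (the ?X correlation), and returning that cell's whole text is exactly the "blind" extraction fallback:

import re

def table_cells(html):
    # The search space: the text of all TD cells, in document order.
    cells = re.findall(r"<td[^>]*>(.*?)</td>", html, re.S | re.I)
    return [re.sub(r"<[^>]+>", "", c).strip() for c in cells]

def dual_wrapper(html, feature_term):
    cells = table_cells(html)
    for i, text in enumerate(cells):
        if text.lower() == feature_term.lower():   # feature locator (wrapper #1)
            if i + 1 < len(cells):
                # value extractor (wrapper #2): the adjacent cell, bound via ?X;
                # taking the cell's whole text is the "blind" extraction.
                return cells[i + 1]
    return None

page = "<table><tr><td>Digital Zoom</td><td>3.6x</td></tr></table>"
print(dual_wrapper(page, "digital zoom"))  # -> 3.6x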
Figure 18. System architecture.
The dual wrapper of Figure 17 can be used as a data collector in the comparison chart building problem, in the same way the simple CG-Wrapper was used in the flea market problem (Section 3.2).

5. A FRAMEWORK FOR INFORMATION EXTRACTION WITH CG-WRAPPERS
In this section, we describe the system architecture of Aggregator, a comparison chart builder that implements the ideas discussed in the previous sections. In addition, we present a small-scale demonstration using a prototype implementation.

5.1. System architecture
The Aggregator is a tool aimed at helping the user rapidly create side-by-side comparison charts using product specification web pages. It consists of four main modules (Figure 18):

• the interactive wrapper creator,
• the evaluator,
• the knowledge-based module, and
• the publisher.
The interactive wrapper creator is a sophisticated visual environment that allows the user to train wrappers. It consists of a web browser instance accompanied by the DOM tree component, interconnected in such a way that the user can focus on the elements of a web page simply by using the mouse (Figure 19). This is achieved through extensive use of an HTML parser that gives access to all the elements of a web page. Finally, this module includes a product feature list, which can be either derived from a predefined product ontology or manually edited by the user. The wrapper creator module allows the user to navigate to desired web locations, where product specification information is presented, and visually map page elements to feature-value pairs of a corresponding wrapper template.

The evaluator module "runs" the created wrappers and actually does the information extraction. The extracted information can be published on the web by the publisher module in the form of a static web page. In addition, it is possible to save it as a spreadsheet table.

The knowledge-based (KB) module is basically a conceptual graph inference engine (the core of which was developed in our past work in [25] and [24]). Its main component is CoGITaNT ([10], [22]), a library of C++ classes allowing the development of applications based on CGs. CoGITaNT allows the handling of CGs using an object-oriented approach and offers a great number of functionalities on them, such as creation, modification, projection, definition of rules, input/output, etc. Furthermore, CoGITaNT can be extended, since it provides a programming interface to define new operations, such as customized concept and relation matching operations and rule execution methods.

The knowledge included in the KB module is divided into domain knowledge and product knowledge. The former is mostly related to the DOM specification and includes concept types related to the DOM elements and relation types that allow us to describe the various usage constraints between DOM elements. The product knowledge, which is also encoded in CGs, serves in three ways:

• it defines the potential features/attributes for which we may build wrappers, in the form of a product ontology,
• it provides generic wrapper templates which the user refines into more detailed ones, and
• it gives insight into the values that a particular wrapper should search for.

The presence of product knowledge is optional, since Aggregator can operate without this information, but at the cost of reduced precision in the extracted information. The URL Wizard is an important sub-component of the KB module that helps the user quickly populate the list of URLs that will be the target of the various wrappers. This is done with minor user input, mainly in the form of link traversal tracking. Internally, this module uses a proper, predefined CG-Wrapper.

Finally, the in-page structure learner is responsible for determining how the feature-value pairs are organized inside a product specification page. This is done by means of a generalization operation as soon as two wrappers have been visually trained by the user.
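The generalization step can be pictured with the following minimal sketch (ours; the real system performs it as a CG generalization operation via CoGITaNT, and the list-of-tags encoding of a wrapper's path is an illustrative stand-in): the learner keeps what the two trained paths agree on and wildcards the rest.

WILDCARD = "*"

def generalize(path_a, path_b):
    # Keep the tags on which the two trained wrappers agree; wildcard the rest.
    return [a if a == b else WILDCARD for a, b in zip(path_a, path_b)]

# Tag paths to the value cells of two wrappers trained on the same page:
w1 = ["html", "body", "table[2]", "tr[3]", "td[1]"]
w2 = ["html", "body", "table[2]", "tr[5]", "td[1]"]

print(generalize(w1, w2))  # -> ['html', 'body', 'table[2]', '*', 'td[1]']
# The learned template partially details the remaining wrapper templates:
# only the wildcarded position must still be fixed for each new feature.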
Figure 19. The wrapper training/evaluation screen of Aggregator.
The learned pattern is used to partially detail the remaining wrapper templates. This reduces the user effort for wrapper training, since the system becomes able to suggest possible page-element-to-wrapper-part assignments.

The prototype of Aggregator runs on Wintel machines. The user interface is built in Delphi (Figure 19) and makes extensive use of Microsoft's HTML parser (the one used in Internet Explorer). The knowledge-based components are built in C++ and make use of the CoGITaNT library.

5.2. Case study
We have done a small-scale evaluation study of Aggregator. We asked four experienced web users to create a feature/value comparison chart for the digital cameras of two brands. Two of the users (1st group) used the Aggregator agent, while the rest (2nd group) used a web browser and a spreadsheet application. All users were provided with two URLs, one for each brand site, which were the entry pages leading to individual product pages. Regarding the individual product pages, both sites had the typical organization presented in Figure 15 (right), that is, the products' page gave access to the pages of individual products, from where access was provided to the specification page of a particular product. None of the users was aware of this organization. In addition, we defined the exact features of interest and provided all users with the proper product feature list. The features of interest were: model name, CCD resolution, focal length, optical zoom, digital zoom, shutter speed, white balance, flash modes, storage media and power source. Exact value extraction was requested only for CCD resolution, focal length, optical zoom, digital zoom and shutter speed.

The first user group used the URL wizard to train Aggregator how to locate the individual product pages. Starting from the given Brand#1 central page, the users of the first group used the visual tools of the Aggregator to quickly collect the URLs of all the product pages (ProductURLs). Just by moving the mouse around, both users were able to rapidly locate the page elements (two HTML tables) that contained all the anchors to the individual product pages, and ordered Aggregator (from a context menu) to record those URLs. Then, they recorded a navigation pattern from a product's main page to the product's specification page. This resulted in a wrapper that, given the initial product URL list, produced a list with the URLs of the specification pages (SpecURLs). The same task was repeated for the second brand site.

After the target pages for information extraction had been defined, each user of the 1st user group had to train the "dual wrappers" that would perform the actual information extraction. With a product specification page loaded into the embedded web browser and a predefined digital camera ontology available, the users had to select the features they were told from the digital camera ontology. The system then internally created the corresponding dual wrapper templates, presented the first one to the user, and waited for him/her to visually associate an element of the specification page (for example, a table cell) with a wrapper element (Figure 19). After that, both users had to point at the page element that contained the value of the attribute under consideration. These two steps are enough for the Aggregator to create a wrapper to handle this specific attribute-value pair. The generated wrapper can be immediately
evaluated over the SpecURLs list. The task is repeated for a second wrapper. These two wrapper instances allow the system to automatically determine the repetitive HTML structures used in the specification page to present the attribute-value pairs. We remind here that this is done with a generalization operation between the two user-defined wrappers.

It is worth mentioning here that Brand#1 had no visible textual model name information. Instead, it provided the model name in the form of a picture. That was no problem for the users of the 1st group, since they assigned this picture's ALT property as the value of the model name feature. On the contrary, the users of the 2nd group had to manually type the model name in a spreadsheet cell.

For Brand#1, the system was able to detail automatically seven of the eight remaining wrappers. The missing case was related to the Optical Zoom feature, because this information was included inside the general description of the product rather than in a dedicated feature-value pair. As a result, this case required the user to manually train the corresponding wrapper. This case demonstrates the advantage of searching for both feature and value related page elements, instead of just value elements. Although this particular wrapper was about "Optical Zoom", its feature part was related by the user to a page element with information about "Type of Camera". By focusing on a tiny part of an HTML page, it is possible to apply more computationally complex methods to extract an exact value for some feature.

A total of 20 wrapper instances was created for both Brand#1 and Brand#2 sites (10 features times the number of brand sites). The time required to perform this information extraction task is presented in Table 1. Although the recall factor was 100% for both brands, that is, all the desired features were located inside the product pages, the precision factor was 70% for Brand#1 and 50% for Brand#2. These precision numbers are not discouraging, because although Aggregator failed to extract exact values for certain features, it had extracted a bigger portion of information that included the exact information.
Table 1. Case study time results

                               1st Group (using Aggregator)        2nd Group (using browser and spreadsheet)
                               user 1    user 2    average         user 3         user 4         average

brand #1        training       254 sec   292 sec   273 sec
(8 products)    extraction*    18 sec    18 sec    18 sec          8 x 199 sec    8 x 183 sec    8 x 191 sec
                total          272 sec   310 sec   291 sec         1592 sec       1464 sec       1528 sec

brand #2        training       320 sec   328 sec   324 sec
(6 products)    extraction*    12 sec    12 sec    12 sec          6 x 240 sec    6 x 224 sec    6 x 232 sec
                total          332 sec   340 sec   336 sec         1440 sec       1344 sec       1392 sec

Complete task                  604 sec   650 sec   627 sec         3032 sec       2808 sec       2920 sec
Per-page average
extraction time                                    45 sec                                        209 sec

* Extraction times for the 2nd group are given in terms of the average time required to extract values from a single product specification page.
This, of course, prevents the user from querying the complete resulting comparison chart in an SQL fashion, but it does not prevent him/her from manually examining the chart and making an informed purchase decision. It is worth mentioning that, although it takes more time for an Aggregator user to train the wrappers for a single brand page than it takes another user to manually extract (with copy-paste) the same information from the same page into a spreadsheet, additional product specification pages of the same brand are processed rapidly, resulting in a lower per-page average extraction time (45 versus 209 seconds, i.e., the average complete-task times of 627 and 2920 seconds divided by the 14 product pages processed).

6. CONCLUSIONS AND FUTURE WORK
Product specification pages provided on-line at various brand sites are an excellent source of information for automatically creating side-by-side comparison charts for "informed" e-shopping. Apart from the information-rich nature of such pages, they also use an in-site fixed vocabulary to refer to the various features of the advertised products, and present these features using repetitive HTML tag combinations of arbitrary complexity.

In this chapter, we have proposed a knowledge-based approach to the comparison chart building problem. Our method is twofold. First, we exploit vertical (in-page) similarities, that is, similarities in the way features are presented inside a product specification page. We visually identify feature-value information, map the surrounding HTML tags to predefined generic wrappers expressed as Conceptual Graphs, and use the generalization operation to "learn" how information is presented inside a specification page of a brand site. This way, additional features can be located and the related values can be extracted automatically, although sometimes at a low precision ratio, because the desired information is mixed with some extra text. In addition, we exploit horizontal (in-site) similarities, that is, similarities across different product specification pages of the same brand. These are vocabulary and page layout similarities. Furthermore, we argue that a product ontology and product background knowledge can speed up the wrapper training process and improve the precision ratio of the extracted information.

We have proposed the use of the Conceptual Graph knowledge representation and reasoning formalism for the knowledge-based part of our approach, mainly due to its expressive power and the analogy between operations provided by CG theory and operations required to train and apply a wrapper. In addition, CGs make it easy to integrate ontological knowledge about the product type under consideration. This feature can contribute to the resiliency and adaptivity of our approach beyond the scope of [20], by adding rules and axiomatic knowledge that can alter the way wrappers are described under certain conditions that hold on other wrappers or the data they extract. Finally, we have outlined the Aggregator, a side-by-side comparison chart builder that is based on the above techniques and provides visual tools to make the whole task easier.
Much more work is required, mainly in the ontology utilization part of our approach. We first aim at providing automatic utilization of on-line ontologies expressed in XML/RDF, in the way we utilize metadata information in [25] and [24]. We also
plan to use Aggregator for side-by-side comparison of learning objects, which have XML-expressed metadata and for which we have already proposed knowledge-based approaches based on CGs ([25], [24]). Additionally, more work is required in the value extraction part of our method. Exact value extraction will require extensive use of regular expressions and probably of NLP techniques, but it will allow us to query more fields of the resulting comparison chart in an SQL fashion. The fact that the part of a page that contains the exact value of a feature can be isolated, and that the kind of the expected value can be defined in the product type ontology, suggests that the whole problem is tractable to a good extent.

Finally, we aim at improving the adaptability of our approach by creating brand-independent wrappers. From some early attempts, this is already possible for feature-value pairs that are crucial features of a product, such as the frequency of a processor or the screen diagonal dimension of a TV set. Apart from having relatively simple values, such features are usually presented alone inside a page, because they are strong purchase decision criteria.

REFERENCES

[1] Adelberg B. "NoDoSE: A Tool for Semi-Automatically Extracting Structured and Semi-Structured Data from Text Documents", SIGMOD Record, 27(2), pp. 283-294, 1998.
[2] Ashish N. and Knoblock C. "Wrapper Generation for Semi-structured Internet Sources". In Proceedings of Workshop on Management of Semi-structured Data, 1997.
[3] Baumgartner R., Flesca S. and Gottlob G. "Declarative Information Extraction, Web Crawling and Recursive Wrapping with Lixto". In Proceedings of the 6th International Conference on Logic Programming and Non-monotonic Reasoning, Springer-Verlag, LNCS 2173, 2001.
[4] Baumgartner R., Flesca S. and Gottlob G. "Visual Web Information Extraction with Lixto". In Proceedings of the 27th International Conference on Very Large Data Bases, pp. 119-128, 2001.
[5] Baumgartner R., Gottlob G. and Herzog M. "Visual Programming of Web Data Aggregation Applications". In on-line proceedings of IJCAI'03 workshop on Information Integration on the Web (IIWeb-03), http://www.isi.edu/info-agents/work-shops/ijcai03/papers/Herzog-ijcai03-herzog.pdf, 2003.
[6] Berners-Lee T. "Conceptual Graphs and the Semantic Web", on-line document, http://www.w3.org/DesignIssues/CG.html
[7] Berners-Lee T., Hendler J. and Lassila O. "The Semantic Web", Scientific American, May 2001.
[8] Buttler D., Liu L. and Pu C. "A Fully Automated Object Extraction System for the World Wide Web". In Proceedings of the 21st International Conference on Distributed Computing Systems, pp. 361-370, 2001.
[9] Chidlovskii B. "Wrapper generation by k-reversible grammar induction". In Proceedings of the Workshop on Machine Learning and Information Extraction, Berlin, Germany, 2000.
[10] CoGITaNT library, available under GPL at: http://cogitant.sourceforge.net
[11] Cohen W. W. and Fan W. "Learning page-independent heuristics for extracting data from web pages". In Proceedings of the Eighth International World Wide Web Conference (WWW-99), Toronto, 1999.
[12] Cohen W. W. and Jensen L. S. "A Structured Wrapper Induction System for Extracting Information from Semi-structured Documents". In Proceedings of IJCAI 2001 Workshop on Adaptive Text Extraction and Mining, 2001.
[13] Cohen W. W. "Recognizing structure in web pages using similarity queries". In Proceedings of the Sixteenth National Conference on Artificial Intelligence (AAAI-99), 1999.
[14] Conceptual Graphs Standard Working Draft, http://www.jfsowa.com/cg/cgstand.htm
[15] Corbett D. "A Method for Reasoning with Ontologies Represented as Conceptual Graphs". In M. Brooks, D. Corbett and M. Stumptner (Eds.): AI 2001, Springer Verlag, LNAI 2256, pp. 130-141, 2001.
[16] Crescenzi V., Mecca G. and Merialdo P. "RoadRunner: Towards Automatic Data Extraction from Large Web Sites". In Proceedings of the 26th International Conference on Very Large Database Systems, pp. 109-118, 2001.
[17] Delteil A., Faron-Zucker C. and Dieng R. "Extension of RDFS based on the CGs Formalisms". In Proceedings of the ICCS 2001, LNAI 2120, Springer Verlag, pp. 275-289, 2001.
[18] Document Object Model (DOM), http://www.w3.org/DOM/
[19] Doorenbos R. B., Etzioni O. and Weld D. S. "A Scalable Comparison Shopping Agent for the World Wide Web". In Proceedings of the 1st International Conference on Autonomous Agents, 1997.
[20] Embley D. W., Campbell D. M., Jiang Y. S., Liddle S. W., Ng Y.-K., Quass D. and Smith R. D. "A Conceptual-Modelling Approach to Extracting Data from the Web". In Proceedings of International Conference on Conceptual Modeling/the Entity Relationship Approach, pp. 78-91, 1998.
[21] Freitag D. and Kushmerick N. "Boosted Wrapper Induction". In Proceedings of the 17th National Conference on Artificial Intelligence, pp. 577-583, 2000.
[22] Genest D. and Salvat E. "A Platform Allowing Typed Nested Graphs: How CoGITo Became CoGITaNT". In Proceedings of the 6th International Conference on Conceptual Structures, Springer-Verlag, LNAI 1453, pp. 154-161, 1998.
[23] Gerbe O. and Mineau G. W. "The CG Formalism as an Ontolingua for Web-Oriented Representation Languages". In Proceedings of the ICCS 2002, Springer Verlag, LNAI 2392, pp. 205-219, 2002.
[24] Kokkoras F. and Vlahavas I. "Metadata Aware Peer-to-Peer Agents for the e-Learner". In "Hercma03" Symposium on "AI Techniques in e-Learning", Athens, Greece, 2003 (accepted for publication).
[25] Kokkoras F., Sampson D. and Vlahavas I. "A Knowledge Based Approach on Educational Metadata Use". Post-proceedings of the 8th Panhellenic Conference in Informatics, Y. Manolopoulos, S. Evripidou and A. Kakas (Eds.), Springer-Verlag, LNCS 2563, 2003.
[26] Kushmerick N., Weld D. S. and Doorenbos R. B. "Wrapper Induction for Information Extraction". In Proceedings of the 15th International Joint Conference on Artificial Intelligence, pp. 729-737, 1997.
[27] Laender A. H. F., Ribeiro-Neto B. A. and da Silva A. S. "DEByE-Data Extraction by Example", Data and Knowledge Engineering, 40(2), pp. 121-154, 2001.
[28] Laender A., Ribeiro-Neto B., da Silva A. and Teixeira J. "A Brief Survey of Web Data Extraction Tools", SIGMOD Record, 31(2), June 2002.
[29] Liu L., Pu C. and Han W. "XWRAP: An XML-Enabled Wrapper Construction System for Web Information Sources". In Proceedings of the 16th IEEE International Conference on Data Engineering, pp. 611-621, 2000.
[30] Muslea I., Minton S. and Knoblock C. "STALKER: Learning Extraction Rules for Semi-structured Web-based Information Sources". In Proceedings of AAAI-98 Workshop on AI and Information Integration, pp. 74-81, 1998.
[31] Muslea I., Minton S. and Knoblock C. "Wrapper induction for semi-structured information sources". Journal of Autonomous Agents and Multi-Agent Systems, 16(12), 1999.
[32] Sahuguet A. and Azavant F. "Building intelligent web applications using lightweight wrappers", Data and Knowledge Engineering, 36(3), pp. 283-316, 2001.
[33] Sahuguet A. and Azavant F. "Building light-weight wrappers for legacy web data sources using W4F". In Proceedings of VLDB '99, pp. 738-741, 1999.
[34] Sowa J. "Conceptual Structures: Information Processing in Mind and Machine". Addison-Wesley Publishing Company, 1984.
[35] Yamada Y., Ikeda D. and Hirokawa S. "Automatic Wrapper Generation for Multilingual Web Resources". In Proceedings of the 5th International Conference on Discovery Science, Springer-Verlag, LNCS 2534, pp. 332-339, 2002.
[36] Yamada Y., Ikeda D. and Hirokawa S.
"Expressive Power ofTree and String Based Wrappers", In on-line proceedings of ljCAI'03 workshop on Information Integration on the Web (IIl#b-03), http://www.isi.edu/ info-agents/workshops/ijcai03/papers/Herzog-ijcai03-herzag.pdf, 2003. [37] YamadaY, Ikeda D. and Hirokawa S. "SCOOP: A Record Extractor without Knowledge on Input". In Proceedings of the 4th International Conference on Discovery Science, Springer-Verlag, LNAI 2226, pp. 428487,2001.
IMPACT OF THE INTELLIGENT AGENT PARADIGM ON KNOWLEDGE MANAGEMENT
JANIS GRUNDSPENKIS AND MARITE KIRIKOVA
This paper concerns the problem of bridging the gap between two different but hot topics in organizational theory and computer science: knowledge management and distributed artificial intelligence. Knowledge management has become increasingly important for the effective operation of organizations and for decision making. Two approaches have appeared in knowledge management: people track knowledge management and information technology track knowledge management. Representatives of the first track, as a rule, are educated in the humanities, while representatives of the second track have an education in computer science. As a consequence, these two communities have rather different understandings of the essence of knowledge management. The distributed artificial intelligence community has borrowed ideas from sociology, organizational theory, economics, linguistics, computer science, etc., and has worked out the concepts of intelligent agents and multiagent systems. The use of these concepts in knowledge management may produce a synergy effect, achieving a balance between both tracks of knowledge management.

1. INTRODUCTION
Nowadays we can observe a rapid evolution from the industrial age to the information age that influences all kinds of organizations. Currently the trend is that technology is advancing at an increasing pace, thereby affecting all aspects of typical organizations. Modern organizations are under pressure to create a new type of workplace due to the progress of computing technology, which causes dramatic changes in the work environment, i.e., the appearance of on-site and off-site offices. Organizations realize that
knowledge is their most important asset. So, there is a need for new types of systems that focus on discovering knowledge and are able to respond to the rapidly changing environment. The information age can be characterized by the interpretation of non-standardized information for problem solving and decision making from the bottom up, and by highly variable organizational networks. These characteristics cause the emergence of a new type of intellectual work, the so-called knowledge work. The essence of knowledge work is turning information into knowledge through the interpretation of available non-standardized information for purposes of problem solving and decision making. Past and current information systems have supported the management of organizations, but modern organizations even at this moment, and to a great extent in the future, will need a newer type of system, i.e., knowledge management systems (KMS), the first examples of which have already been implemented.

Knowledge management (KM) has become a new way of capturing and efficiently managing an organization's full experience and knowledge. In industry, knowledge has become relevant. It is recognized as a strategic resource and a critical source of competitive advantage [1]. However, relatively little attention has been devoted to how knowledge can be effectively used to enrich the competencies of such service organizations as, for instance, higher education or health care organizations. Intuitively, this type of organization is very rich in knowledge. At the same time, the question is still open why these organizations are "information rich" but "knowledge poor" despite the growing role of advanced information technologies in education and health care. One of the reasons is the lack of systematic (and formal) methods to capture, represent, store, convert and transfer both types of knowledge, called tacit and explicit knowledge [2].

In general, it is clear that any organization nowadays needs to be more conscious of its vast knowledge resources. That is why KM is a hot topic in the business world and knowledge management techniques become more and more popular. There are a lot of books and articles as well as specialized journals published tackling issues of KM and related problems (more than 300 titles can be found on the Web). At the same time, the main concepts of KM are not generally accepted and used even inside the KM professional community. Moreover, two different tracks exist in KM [3]. In the information technology track of knowledge management, researchers and practitioners (educated in computer, information and/or systems science) are involved in the construction of information management systems, artificial intelligence, reengineering, groupware, etc. This track is relatively new and is developing very fast, supported by new developments in information technology. In contrast, the people track of knowledge management is very old and is not growing so fast. Researchers and practitioners in this field (educated in philosophy, psychology, sociology, business or management) are involved in assessing, changing and improving human skills and/or behavior. Because of their different origins, the above-mentioned tracks use different languages and to a certain extent even do not recognize each other. To illustrate this, let us follow the classification of the central themes dominating the field of KM [4], namely, organizational learning, document management and technology.
The first theme represents the people track and the third represents the information technology track of KM, while the second is placed somewhere between both tracks. Organizational learning specialists
claim that information technology has never addressed tacit knowledge, and that the information technology approach is a purely mechanistic solution of information issues, which can be considered as naively promoting software and hardware packages to resolve knowledge management problems. The focus of document management specialists is on the explicit knowledge component captured in such information systems as libraries, information centers, record centers and archives. Technology specialists view KM from the point of view of systems analysis, design and implementation. Their approach may emphasize one or several areas, in particular, knowledge storage and access, telecommunications, and application software packages.

The main discrepancy between various opinions about the essence of KM is the different focus on its objects. Those who represent the "people track" call themselves "organization theorists" and are convinced that knowledge is not something that can be managed [3]. For them, KM is the art of creating value from an organization's intangible assets. They argue that the user inputs the knowledge, not the "knowledge manager" or "knowledge engineer" [5]. As a consequence, the people track community is very cautious about the success of information technology and artificial intelligence, in particular, in efforts to capture and structure tacit knowledge to make it accessible, despite the fact that more sophisticated methods and tools for improving the process of converting knowledge types, for example those based on patterns [6], are suggested. On the contrary, those who represent the information technology track are focusing their efforts on "how to achieve knowledge flow" in organizations, because their argument is "that knowledge which does not flow, does not grow." Thus any technological advances that help to promote knowledge flow are considered KM tools. In fact, real synergy can be achieved if we have a balanced approach to KM, taking into consideration the advantages and drawbacks of both the people and the information technology track. It is quite obvious that if more knowledge is captured and made accessible, the organization becomes richer in knowledge, and vice versa. This is very important for organizations that operate in a rapidly changing environment, or for service organizations which are under permanent pressure from the business world and risk losing their knowledge when somebody leaves the organization. In this case knowledge capturing, storage and usage are the most important activities for keeping the organization's intellectual capital up to date. Modern approaches to artificial intelligence (AI) based on the intelligent agent paradigm are very promising for managing these activities.

In this paper we try to bring together various concepts used in KM and AI to give a flavour of the possible impact of modern approaches to AI on KM in organizations. In particular, the paper identifies the role of intelligent agents and multiagent systems (distributed AI) in KM. The content of this paper is structured as follows. First, the introduction gives a general insight into the problem. Section two presents the historical paradigm shift from data and information management to knowledge management. Next, sections three and four describe what KM and its architecture are, and how information technology and AI support KM. KM definitions are classified into three classes using the proposed criteria of formal, process and organizational aspects.
Section five offers a glimpse of intelligent agents and multiagent systems. Section six considers various
knowledge possessors, types and sources. In section seven, organizations as communities of agents and passive objects are discussed. Section eight proposes the concept of an intelligent organization-agent. The up-to-date role of intelligent agents in KMS is presented in section nine. In this section a novel conceptual model of an organization's knowledge management system is discussed, and an outline of the perspectives of the use of multiagent systems in KM is given. Conclusions are drawn in section ten.

2. PARADIGM SHIFT: FROM DATA AND INFORMATION MANAGEMENT TO KNOWLEDGE MANAGEMENT
Nowadays we can observe an evolution from the industrial age to the information age. The industrial age may be characterized as follows:

• Production and consumption of material things
• Accents on manual (physical) work, not so much on creative brain-work
• Hierarchical and centralized distribution processes
• Re-use of pre-defined content, i.e., the application of previously fixed procedures
• Compliance with standardized information schemes.

The information age started in the last decades of the twentieth century. It may be characterized by:

• Production and consumption of information
• Accents on creative brain-work, not so much on manual work
• Highly variable and distributed organizational networks
• Interpretation of non-standardized information used for decision making and problem solving
• Decentralized decision making from the bottom up.

These changes have caused the appearance of a new type of intellectual work, the so-called knowledge work, and will be constant in the new millennium. The essence of knowledge work is turning information into knowledge through interpretation. Unfortunately, the notions of information and knowledge are ambiguous because generally accepted definitions do not exist. There is a need to determine the relationships among data, information and knowledge. Following [4, 5], where these relationships are considered from the management perspective, data represent unstructured facts and figures, while information is structured data that is useful for the manager in analyzing and solving his/her problems. Knowledge is obtained from experts based on actual experience. In order to see patterns and trends that enable managers to make current and future decisions, there is a need to integrate the range of information. Several authors try to give more general definitions of information and knowledge. According to [7], "information consists largely of data organized, grouped and categorized into patterns to create meaning, and knowledge is information put to productive use, enabling correct action." Information is converted into knowledge through the human process of interpretation, shared understanding and sense making. This process occurs at both the personal and organizational level.
Looking back at the last two decades of the 20th century, we can notice a focus on quality in the 1980s and on reengineering in the 1990s. Quality requirements placed an emphasis on how to achieve the level of performance where employees use their brain power better. Reengineering (redesigning the operations and workflow of organizations) emphasizes the use of information technology and electronic communication to improve business processes and to make organizations more efficient and more effective. Both business process reengineering and knowledge management derive from the same basis: that organizations more and more widely start to use information technology instead of print-on-paper technology.

At the beginning of the 21st century the work environment is changing dramatically. The demand for skilled "knowledge workers" escalates around the world. There is a need for new types of systems that focus on discovering and processing knowledge and that respond to the rapidly changing environment [8]. Knowledge management systems are at the forefront of these newer types of systems found in typical organizations. "Knowledge workers" fulfill a new type of intellectual work. Knowledge work is about making sense. It may be considered as content creation, i.e., the generation of new knowledge to make an organization's activities more effective and to stimulate the innovation process of the organization. That is why there is a prevalence of knowledge workers in the sectors directly related to content creation: research, design, consulting, etc. A very significant issue is that knowledge work requires a paradigm shift in organizational thinking, with respect to process planning, control and business process reengineering. Peter Drucker [7] argues that "to make knowledge work productive is the great management task of this century, just as to make manual work productive was the great management task of the last century."

KM emerges as a natural evolution of the importance of quality and reengineering. Experience obtained from quality assurance and reengineering activities has led to a situation where organizations now turn their attention to growth. Innovation is the primary key to growth. Innovation, promoted through knowledge, is strongly connected with the need to design, develop and deliver new products and/or services. Consequently, organizations nowadays must take a more systematic approach to managing the main drivers of innovation, i.e., productivity improvements of the knowledge workers, and the rapid building and utilization of the organization's knowledge accumulated as the organization's intellectual capital. At the same time, innovation alone is not the key to an organization's success. Even very successful organizations will progress much faster if they have the following capabilities:

• Innovativeness
• Social propagation
• Movement (growth).

One of the barriers to successful and effective knowledge work is the lack of a clear distinction between information and knowledge and, especially, between information management and knowledge management.
In the definitions given in [7], information management often starts with technological solutions. Knowledge management, in contrast, starts by laying stress on people, their work practice, experience and culture, before deciding whether and how technology should be brought into the process.

3. KNOWLEDGE MANAGEMENT: DEFINITIONS AND ARCHITECTURE
KM is a concept that has emerged explosively over the last few years. Is this concept of KM really new? The answer is: not really! The discipline of KM is only seventeen years old; the term "knowledge management" was coined by Karl Wiig in 1986. It is not easy to find a widely recognized definition of KM. At present there is much debate, and little consensus, about exactly what KM in fact is. There is a wide variety of definitions of KM in the corresponding literature. The perceptions of KM depend on the person and his/her speciality [4]. For example, information professionals (librarians and archivists) emphasize document management, while information technologists stress hardware, software, networks and telecommunications. Scientists, state or local government, specialists in education, health care, industry, business, agriculture, etc., have their own viewpoints reflecting their interests in KM. The general opinion is that KM is the amalgamation of earlier experience, i.e., past and current systems such as database management systems, business process reengineering, management information systems, decision support systems, total quality management, knowledge-based systems, artificial intelligence, software engineering, human resource management and organizational behavior concepts [4, 9].

Looking through the available literature on KM, we have tried to add some classification to the definitions summarized by Liebowitz [10] and those given by Tiwana [11], Sarvary [12] and Sveiby [3]. Three classification criteria have been chosen: formal aspects, process aspects, and organizational aspects. Several authors try to stress systematic and formal aspects:

• Knowledge management is the systematic, explicit, and deliberate building, renewal, and application of knowledge to maximize an enterprise's knowledge-related effectiveness and returns from its knowledge assets (Wiig).
• Knowledge management is the formalization of and access to experience, knowledge, and expertise that create new capabilities, enable superior performance, encourage innovation, and enhance customer value (Beckman).
• Knowledge management involves the identification and analysis of available and required knowledge, and the subsequent planning and control of actions to develop knowledge assets so as to fulfil organizational objectives (Macintosh).

Several attempts to define knowledge management as a process are as follows:

• Knowledge management is the process of creating value from an organization's intangible assets (Liebowitz).
• Knowledge management is defined as a process through which organizations create, store and utilize their collective knowledge (Sarvary).
• Knowledge management is the process of capturing a company's collective expertise wherever it resides (in databases, on paper, or in people's heads) and distributing it to wherever it can help produce the biggest profit (Hibbard).
• Speaking in more detail, the knowledge management process includes three stages: organizational learning, the process of acquiring information; knowledge production, the process of transforming and integrating information into usable knowledge; and knowledge distribution, the process of disseminating knowledge throughout the organization (Sarvary).

Other definitions focus on organizational and management aspects:

• Knowledge management is the art of creating value from an organization's intangible assets (Sveiby).
• Knowledge management is the explicit control and management of knowledge within an organization aimed at achieving the company's objectives (van der Spek).
• Knowledge management means exactly the management of organizational knowledge for creating greater value and generating a competitive advantage (Tiwana).
• Knowledge management is getting the right knowledge to the right people at the right time so they can make the best decision (Petrash).

The most consistent definition of this group is the following:

• Knowledge management is a business problem and falls in the domain of information systems and management, not in computer science. It means that knowledge management is not knowledge engineering, because knowledge engineering is barely related to knowledge management. Knowledge management needs to meld information systems and people in ways that knowledge engineering has never been able to (Tiwana).

A quite different opinion is demonstrated by Sveiby [3]. He tries to define KM by looking at what people in this field are doing. He distinguishes between two tracks of activities, namely, information technology track knowledge management and people track knowledge management. Because of their different origins, these two tracks use different languages, which frequently causes confusion. The first track corresponds to the management of information, where researchers and practitioners tend to have their education in computer and/or information science. They are involved in the construction of information management systems, artificial intelligence, reengineering, groupware, etc. To them, knowledge means objects that can be identified and handled in information systems. The focus of artificial intelligence (AI) specialists and E-specialists is on the individual, while the focus of reengineers is on the organization. This track is new and is growing very fast at this moment due to new developments in information technology. According to [4], in the information technology track knowledge management has become a new way of capturing an organization's expertise, addressing factors such as:
• Databases, Web site interfaces and documents
• Knowledge infrastructure for just-in-time knowledge and global access
• Enhancing the amount and visibility of knowledge in an organization
• Sharing knowledge both within an organization and with external clients
• Capturing tacit knowledge and experience of knowledge workers, and promoting the transformation of tacit knowledge into explicit knowledge for global access
• Knowledge collection in libraries, archives, repositories, administrative and operational units.

People track knowledge management corresponds to the management of people. Researchers and practitioners in this field tend to have their education in philosophy, psychology, sociology and/or business and management. They are primarily involved in assessing, changing and improving individual human skills and/or behavior. To them, knowledge means processes, a complex set of dynamic, constantly changing skills, know-how, etc. They are traditionally involved in learning and in managing these competencies either on an organizational level, like the so-called organizational theorists, i.e., philosophers and sociologists, or on an individual level, like psychologists. This track is very old and is not growing so fast.

The gap between these tracks is rather wide due to the different education of the communities representing each particular track and, what is even more crucial, due to the different points of view on the real nature of knowledge. Representatives of the people track strongly believe that only humans possess knowledge [13]. Representatives of the information technology track have a broader viewpoint and argue that there are natural knowledge possessors and artificial knowledge possessors. Section 6 discusses this topic in greater detail. We believe that this point of view is more promising and will help to narrow the gap between the two tracks of KM. In section 9, one of the possible ways to achieve this goal is developed, based on the modern approach to AI, the intelligent agent paradigm.

Our approach to a certain extent has parallels with another way to view KM: as the evolution of existing information systems and the consciousness of two relatively new insights, the recognition of the importance of intellectual and social capital. Srikantaiah [4] defines KM by the formula: knowledge management = systems + intellectual capital + social capital.
A wide variety of systems is listed: database management systems, business process reengineering, management information systems, decision support systems, just-in-time inventory management, total quality management, enterprise resource planning, data warehousing, data mining, electronic data exchange, etc. Intellectual capital is defined by Stewart [14] in the following way: intellectual capital is intellectual material that has been formalized in some useful order, captured in a way that allows it to be described, shared, distributed, and leveraged to produce a higher valued asset. It is packaged, useful knowledge. Intellectual capital has two major components [5]: information/knowledge capital and structural capital. Information and knowledge capital is the organization's information and knowledge, which can be informal
and unstructured as well as formal. Structural capital consists of the mechanisms to capture, store, retrieve, and communicate that information and knowledge, i.e., to take advantage of the information and knowledge capital. Knowledge capital, in turn, includes all the organization's tacit and explicit knowledge. Social capital is defined as the sum of the actual and potential resources embedded within, available through, and derived from the network of relationships possessed by an individual or social unit [15]. Social capital includes such attributes as culture, trust, anticipated reciprocity, context, and informal networks. As shown earlier, social capital is what has been added to intellectual capital to create knowledge management.

This latest way to view KM implicitly includes the idea of the necessity of knowledge flow. The concept of the KM architecture is based on the awareness of the importance of knowledge flow. In [7] there is a compelling phrase: "Knowledge that doesn't flow doesn't grow." So, knowledge that doesn't flow quickly becomes out of date and sooner or later becomes absolutely useless. Nonaka and Takeuchi [2] tried to emphasize a purely social characterization of the environment of the KM architecture, using the concept of the life cycle of organizational knowledge. It is a rather narrow view because, in fact, the KM architecture tackles issues directly related to the management of the information technology infrastructure. Following Borghoff and Pareschi's approach [7], the KM architecture is composed of four components:

• The flow of knowledge (using knowledge, competencies, and interest maps to distribute documents to people)
• Knowledge cartography (knowledge navigation, mapping and simulation using tools like work process simulators, domain-specific concept maps, design and decision rationale, maps of people's competencies and interests, etc.)
• Communities of knowledge workers (awareness services, context capture and access, shared work-space, experience capture, knowledge work process support)
• Knowledge repositories and libraries (search, heterogeneous document repository, access, integration and management, directory and links, publishing and documentation support).

Wang [16] focuses on the technology components that constitute the infrastructure of KMS, and proposes seven layers of its architecture:

• Interface (browser)
• Access and authentication (recognition, security, firewall, tunneling)
• Collaborative intelligence (intelligent agent tools, collaborative information filtering, content personalization, search, indexing and metatagging)
• Application (skills directories, maps of people's competencies and interests, collaborative work tools, video conferences, electronic forums, digital white boards, decision support systems and tools)
• Transport (Web and TCP/IP (transmission control protocol/Internet protocol) development, e-mail and POP/SMTP support, streaming audio, video transport, electronic document exchange)
• Middleware and legacy integration (wrapper tools)
• Repositories (legacy, data warehouses, discussion forums, document bases, knowledge repositories, digital libraries).

The main objective of the KMS architecture is to provide an effective knowledge flow. Effective knowledge flow is strongly connected with knowledge sharing, which enhances the learning capacity at both the individual and the organizational level.

4. KNOWLEDGE MANAGEMENT SUPPORT
In the previous section we discussed KM from different viewpoints. Now let us consider how it is possible to support knowledge management as an ability to turn knowledge into action. This is closely connected with the expansion of an individual's personal knowledge into knowledge of the organization as a whole. For this purpose the organization must become a learning organization, which, in its turn, requires the ability to work in teams and the capability to expand an individual's personal knowledge. For organizations it is even more difficult to create a learning environment for the permanent expansion and maintenance of collective knowledge. Understanding and supporting KM must lead towards the creation of a knowledge environment and the widespread usage of KM tools in an organization's everyday life. The knowledge environment supports:

• Knowledge creation (development, acquisition, inference, generation)
• Knowledge storage (representation, preservation)
• Knowledge aggregation (creation of meta-knowledge)
• Knowledge use/reuse (access, analysis, application)
• Knowledge transfer (distribution, sharing).
Moreover, the knowledge environment must support both personal knowledge and organizational knowledge. KM tools and techniques afford an effective technological solution for the acquisition, presentation and use of an organization's knowledge. Typically this involves converting information into knowledge and connecting people to knowledge. KM tools may be supported by an information technology infrastructure and/or AI techniques. In the first case, information management tools make it possible to generate, store, access and analyze data. Well-known examples of these tools are data warehouses, data search engines (Internet search engines), data mining, data modeling and visualization tools, etc. Knowledge management systems exist on computer hardware and are transmitted over telecommunication lines. A variety of computer platforms can be used; for instance, it is possible to access a KMS via a workstation, from a personal computer connected to a network server, or from a personal computer connected to the Internet. More details about the information technology infrastructure may be found in [17]. Many parts of the Internet, including the World Wide Web, HTML (hypertext markup language), dynamic HTML, XML (extensible markup language), FTP (the file transfer protocol), TCP/IP, as well as local area networks (LAN) and wide area
networks (WAN), are examined in detail there. In particular, modems and dial-up access, the faster communication technologies such as Integrated Services Digital Network (ISDN) and frame relay, and the technologies like digital subscriber line (DSL), cable modems, and multichannel multipoint distribution service (MMDS) used to support communications between computers on WANs are examined, too. Several advanced technological aspects are explored in [18].

KM tools, in their turn, make it possible to develop, combine, distribute and secure knowledge. Examples of these tools are knowledge flow enablers, knowledge navigation systems and tools, corporate memories, knowledge repositories and tools, etc. Many KM technologies have already been developed and some of them are rather widely used in thriving organizations. According to [19], KM technologies include:

• Document management for publishing and control of the various kinds of documents circulating in organizations
• Workflow for document routing and exchange
• Project management for the development and planning of activities and resources
• Data warehouses for knowledge discovery
• Intranets for connectivity and publishing
• Web conferencing for dialogue maintenance
• Helpdesks for problem and solution finding
• Groupware for collaboration.

Considering KM tools supported by AI techniques, it must be taken into account that in operational terms KM is concerned with the formal management of knowledge: the identification, creation, supply, access, dissemination, reuse, storage and preservation of knowledge in a knowledge base. Among the KM tools supported by AI techniques are the following:

1) Traditional AI systems such as management information systems, decision support systems and expert systems
2) Intelligent agents and corresponding tools
3) Virtual reality.

Due to the orientation of this paper we will concentrate more on intelligent agents. It is worth adding only that traditional AI systems are widely described in the literature. They have played a certain role in KM; however, Koenig and Srikantaiah [5] argue that "there certainly have been AI successes but they have been at the tactical not at the strategic level envisioned by the proponents of knowledge management, nor have they been of the collaborative synergistic kind, yielding new knowledge or faster learning". Thus, there is a clear need to look for modern approaches to AI that may overcome the drawbacks of traditional AI systems used to support KM. The intelligent agent paradigm is one of the most promising relatively new directions in AI [20].
5. INTELLIGENT AGENTS AND MULTIAGENT SYSTEMS
More and more information circulates in modern organizations, accessible through computer networks, and we have started building systems to help us find the information we need and generate the knowledge we need. These systems are one application of the so-called "intelligent agents." The agent metaphor subsumes both natural and artificial systems. Several attempts have been made to define what may be considered an agent [21]. The software agent approach [22] emphasizes the significance of application-independent, high-level agent-to-agent communication and states that "an entity is a software agent if and only if it communicates correctly in an agent communication language." In effect, such an agent is a piece of software that uses techniques from AI to assist a human user of a specific application. Weiss [23] defines an agent as a computational entity such as a software program or a robot. The mentalistic approach [24], based on the knowledge representation paradigm of AI, holds that "an agent is an entity whose state is viewed as consisting of mental components such as beliefs, capabilities, choices and commitments." In the last definition two important components are missing, namely, perceptions and memory of past events and actions. Some authors, for example [20], use a more general approach. They argue that "an agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through effectors." Perceptions thus form the basis for reactive behavior. There is an ontological distinction between agents and objects [25]. Only agents are active entities that can perceive events, perform actions, communicate and make commitments. Objects are passive entities with no such capacities. The following definition given by Hayes-Roth [26] summarizes the most important performance features of agency: "Intelligent agents continuously perform three functions:
• Perception of dynamic conditions in the environment
• Action to affect conditions in the environment
• Reasoning to interpret perceptions, solve problems, draw inferences, and determine actions."
Moreover, intelligent agents should act rationally and should be autonomous. Rationality means that for each possible percept sequence, an ideal rational agent should do whatever action is expected to maximize its performance measure, on the basis of the evidence provided by the percept sequence and whatever built-in knowledge the agent has in its memory (knowledge base). Agents are autonomous to the extent that their behavior is determined by their own experience, i.e., an agent can operate without direct control from humans or other agents. Some researchers add further properties such as being goal-directed and reactive. In other words, an agent works towards a pre-defined goal, and the user waits for the result of the agent's work. An agent can react to various stimuli from the environment, but there are also agents that can themselves
take the initiative to get closer to their pre-defined goals (proactive agents). This broader view of the term agent is used in [27] to describe any relatively autonomous actor with sets of
• Goals (conditions the agent works to achieve or fulfill)
• Intentions (goals or subgoals the agent is currently engaged in pursuing)
• Beliefs (necessarily limited and possibly inaccurate knowledge about the world)
• Behaviours (actions the agent is able to take).
From the structural point of view, an agent is a program plus an architecture. The initial phase for an agent program is to understand and describe percepts, actions, goals and the environment. The core of the agent program, whose body consists of three functions, may be written as follows:
Agent program
  Input: percepts
  Update-Memory(memory, percept)
  Choose-Best-Action(memory)
  Update-Memory(memory, action)
  Output: actions

An agent architecture specifies the decomposition of an agent into a set of modules and the relationships between these modules. The architecture of agents includes the main components of intelligent systems, such as a knowledge base and an inference engine. In addition, as is shown in Figure 1, agents have sensors and effectors.

Figure 1. Schematic diagram of a simple intelligent agent.

Such an architecture realizes the intelligent agent program: sensors supply it with percepts, the knowledge base and inference engine execute the Update-Memory and Choose-Best-Action functions, and effectors apply actions to the environment.
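As a concrete illustration, the skeleton above may be rendered, for example, in Python (one of the agent-programming languages mentioned later in this section). The class and the trivial condition-action knowledge base below are our own illustrative assumptions, not part of any particular agent framework:

  # A minimal sketch of the skeleton agent program above. The names mirror
  # the pseudocode; the environment (sensors and effectors) is assumed to
  # call step() with a percept and to carry out the returned action.
  class SkeletonAgent:
      def __init__(self, knowledge_base):
          self.memory = []                       # memory of past percepts and actions
          self.knowledge_base = knowledge_base   # built-in knowledge

      def update_memory(self, item):
          # Record a percept or an action so it can inform future decisions.
          self.memory.append(item)

      def choose_best_action(self):
          # Placeholder decision rule: a real agent would run its inference
          # engine over the memory and the knowledge base here.
          return self.knowledge_base.get(self.memory[-1], "do-nothing")

      def step(self, percept):
          self.update_memory(percept)            # Update-Memory(memory, percept)
          action = self.choose_best_action()     # Choose-Best-Action(memory)
          self.update_memory(action)             # Update-Memory(memory, action)
          return action

  # Example: a trivial condition-action knowledge base.
  agent = SkeletonAgent({"temperature-high": "open-valve"})
  print(agent.step("temperature-high"))          # -> open-valve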
Several simple agent architectures are described in [20]. The more interesting architectures from the KM point of view are goal-based and utility-based agents. Agents that are able to search and to make plans (search and planning agents) are examples of goal-based agents. Decision-making agents are examples of utility-based agents. In the plethora of intelligent agents, the most advanced ones are learning agents. The idea behind learning is that percepts are used not only for acting, but also for improving the agent's ability to act in the future. Learning takes place as a result of the interaction between the agent and the world, and from the agent's observation of its own decision-making processes. Russell and Norvig [20] point out that all learning can be seen as learning the representation of a function, such as logical sentences, belief networks, or neural networks. When the agent learns such a function by comparing inputs and desired outputs, the learning is called inductive learning. Another type of learning is to provide positive feedback that reinforces behaviours which reach a goal state successfully. This is called reinforcement learning (a minimal numeric sketch is given below). One of the advanced subclasses of learning agents is self-learning agents. These agents allow each user to adjust the agent's instructions and to use knowledge bases. In this case the user is offered an agent that can be trained without the user having to learn the agent's language. Instead of traditional programming, the agent is instructed through:
• Giving direct, unambiguous examples of the needed functionality
• Importing functionality from other agents
• Letting the agent observe the user's working process and determine what it should do.
Learning agents would obviously be of great importance in KM, but at the present moment researchers who represent the information technology knowledge management track consider them more as a future technology (see section 9 of this paper). At the same time, some of them already exist, for example, assistant and filtering agents. Besides learning, agents that are designed to participate in KM must have the ability to communicate. This is straightforward, because an agent can do its job well only if it can take advantage of all knowledge resources and all other agents. The ideas about agents as computational entities that interact with each other to solve various kinds of distributed problems were developed under the rubric of distributed artificial intelligence [28]. The latest developments in this area are connected with Web intelligence [29].
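To make the reinforcement learning idea mentioned above concrete, the following fragment is a minimal illustrative sketch (ours, not taken from the cited sources; the action names, the reward function and the learning rate are invented). Positive feedback gradually strengthens the estimated value of the behaviour that reaches the goal:

  # Reinforcement learning in its simplest tabular form: rewarded actions
  # are reinforced, so the agent ends up preferring the successful one.
  import random

  actions = ["left", "right"]
  q = {a: 0.0 for a in actions}   # estimated value of each action
  alpha = 0.1                     # learning rate (illustrative value)

  def reward(action):
      # Hypothetical environment: "right" moves the agent towards its goal.
      return 1.0 if action == "right" else 0.0

  for _ in range(1000):
      a = random.choice(actions)              # explore
      q[a] += alpha * (reward(a) - q[a])      # reinforce what worked

  print(max(q, key=q.get))                    # -> right: the reinforced behaviour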
To be useful, any agent, whether intelligent or not, natural or artificial, cannot be an isolated entity. As already defined, an agent is an entity that can sense data, reason using these data and its built-in knowledge, and act according to its goals and beliefs. Both sensing and acting are forms of communication. "In general, communication is the intentional exchange of information brought about by the production and perception
of signs drawn from a shared system of conventional signs" [20]. A shared, structured system of communication is a language [30]. An intelligent agent's communication with its environment can take several forms: informing other agents about itself and its knowledge of its environment, querying other agents about their state, answering queries, requesting or commanding other agents to do something, giving promises or offering to do deals, and acknowledging communications with other agents [20]. One of the most difficult aspects of intelligent agent design is how to give the agent the ability to determine what to communicate, and when. Likewise, it is difficult to design intelligent agents that can understand each other's communications when they take place. Understanding makes use of some language or protocol: a formal specification of the syntax and semantics of a statement, knowledge unit or message. Both natural-language processing and formal computer languages are used for interagent communication. If communicating agents share the same internal knowledge representation scheme, a direct-access message interface of the form TELL (Agent X, Some Knowledge) or ASK (Agent X, Some Question) can be used [30]. In more complex cases, agents with different knowledge representation schemes need to communicate with each other. This requires the use of more complex external languages, which, in turn, may require parsing messages, performing syntactic, lexical, and semantic analysis, and performing disambiguation, a technique used to diagnose or interpret a message in relation to a particular world model [30]. All these natural-language processing techniques are involved in developing agent-communication languages. A particularly promising agent language is the Agent Communication Language (ACL), based on evolving standards such as the Knowledge Query and Manipulation Language (KQML) and the Knowledge Interchange Format (KIF). One way agents can use ACL is to communicate knowledge and actions about a particular application domain. This architecture is proposed by Genesereth [31]. For more details see [30, 32]. ACL arguments are based on KIF. KIF is a LISP-like language which is considered to be a standard protocol for knowledge sharing and communication among diverse, heterogeneous agents. KIF not only defines the capability for declaring reasoning rules and expressions and for creating arbitrary sentences in the first-order predicate calculus, but also provides the capability to define objects, functions, and relations related to knowledge representations. KIF semantics are based on first-order predicate logic. These semantics support variables, operators, constants, rules, and definitions. The combination of these elements makes it possible to build knowledge about objects in a specific problem domain. KQML is one of the most widely used agent communication formalisms. KQML is an all-purpose agent communication and query language, that is, an advanced query protocol which allows diverse agents to communicate without forcing
agents into a specific structure or implementation [30]. Through KQML, agents can share knowledge and information and cooperate with each other in problem solving. Besides, KQML provides a basic architecture for knowledge sharing through a special class of agents, the so-called communication facilitators, which coordinate the interactions of other agents (this is very important in areas like concurrent engineering, intelligent design, planning and scheduling, and knowledge management). For in-depth information on the KQML language navigate to http://www.cs.umbc.edu/kqml.
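For illustration, a KQML message is a LISP-like expression built around a performative. The following hypothetical exchange (the agent names, the ontology and the content are invented for the example, in the spirit of the stock-quote examples from the KQML literature) shows the general shape of a query and its reply:

  (ask-one
    :sender      seeker-agent
    :receiver    stock-server
    :language    KIF
    :ontology    NYSE-TICKS
    :reply-with  q1
    :content     (PRICE IBM ?price))

  (tell
    :sender      stock-server
    :receiver    seeker-agent
    :language    KIF
    :in-reply-to q1
    :content     (= (PRICE IBM) 105.25))

The performative (ask-one, tell) fixes the communicative intent, while the :content slot carries a KIF expression that only the two end agents need to understand; a facilitator can route the message using the remaining slots alone.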
Learning and communication abilities come together in multiagent systems, which are the research topic of distributed AI. Distributed artificial intelligence today is a promising research and application field which concentrates on agents as intelligent connected systems consisting of agents that are autonomous and distributed, and which brings together ideas, concepts, and results from many disciplines: computer science, artificial intelligence, organization and management science, economics, philosophy and sociology. Distributed AI, in its essence, is "the study, construction, and application of multiagent systems, that is, systems in which several interacting, intelligent agents pursue some set of goals or perform some set of tasks" [23]. Thus, in multiagent systems, interaction is goal- and/or task-oriented coordination. Coordination is a particularly important form of interaction with respect to goal attainment and task completion. Two basic, alternative activities of coordination are cooperation and competition. In the case of cooperation, agents work together using their knowledge and capabilities to achieve a common goal. In the case of competition, agents work against each other because their goals are conflicting. Cooperating agents try to accomplish as a team what the individuals cannot, while competitive agents try to maximize their own benefit at the expense of others. It is obvious that for KM cooperation is relevant, but competition is undesirable, at least inside the organization.

Multiagent environments provide an infrastructure specifying communication and interaction protocols. The main issues in multiagent systems therefore center around the question of interaction (when and how to interact with whom). Interaction implies that agents may have relationships with other agents or humans. Interaction can be indirect (agents observe each other or carry out actions that modify the environment state) or direct (agents use a shared language to provide information and knowledge); a toy sketch of both styles is given below. Today, multiagent systems have the capacity to play a key role in knowledge management for at least two main reasons. First, modern organizational and information systems are distributed, large, complex and heterogeneous. Like multiagent systems, these modern systems are typically intended to act in complex (large, open, dynamic and unpredictable) environments. They require the processing of huge amounts of decentralized data, information and knowledge. Second, multiagent systems model interactivity in societies of humans (natural agents) when humans form organizational structures. Modeling makes it possible to explore the sociological and psychological foundations of interactive processes among humans, which are still poorly understood.
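The following toy sketch in Python (with invented names, not taken from the cited sources) contrasts the two interaction styles: in the indirect case an agent merely observes and modifies a shared environment, while in the direct case agents exchange explicit, addressed messages:

  # Indirect interaction: agents never address each other; they only read
  # and modify a shared environment state (a blackboard-like structure).
  environment = {"task-posted": True, "task-done": False}

  def worker_observes(env):
      if env["task-posted"] and not env["task-done"]:
          env["task-done"] = True               # acting on the environment

  # Direct interaction: agents use a shared message format (here, a tagged
  # tuple of performative, receiver and content) to address each other.
  inbox = []

  def manager_requests(task):
      inbox.append(("request", "worker", task))

  def worker_reads():
      performative, receiver, content = inbox.pop(0)
      return f"{receiver} handles '{content}' (a {performative} message)"

  worker_observes(environment)
  manager_requests("compile report")
  print(environment["task-done"], worker_reads())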
In [33] the following characteristics of multiagent systems are identified:
• Each agent has incomplete information
• Each agent is restricted in its capabilities
• Data are decentralized
• System control is distributed
• Computation is asynchronous.
It is easy to see that all these characteristics match an organization's needs when a knowledge management system is implemented. Multiagent systems can differ in the agents themselves, the interactions among them, and the environments in which the agents perform their actions. An extensive overview of multiagent system attributes is given in [28]. The modern concept of a multiagent system covers both primary types of distributed artificial intelligence systems: multiagent systems in which several agents coordinate their knowledge and activities and reason about the processes of coordination, and distributed problem solving systems in which the work is divided among a number of agents that divide and share knowledge about the problem and its solution [28]. Both types of systems may be important for the development of KMS. To conclude this section, let us stress that all the agents overviewed may be designed and implemented using programming languages and environments that effectively support agent building and execution. Agent programming languages usually have some common features, namely, support for AI and networking (ease of distributing agents across a network and of collecting information from networks), and they make it easy for agents to talk to each other so that they can cooperate. Usually such programming languages as Java, Smalltalk, and Objective C are suggested, but Tcl/Tk, Telescript, Obliq, Limbo and Python are also mentioned in the literature. Knapik and Johnson [30] argue that Java and Smalltalk have tremendous potential as agent languages; however, they are not yet ready to provide a standardized agent execution environment and architecture. The arguments in favour of Java are the following: it is an object-oriented language, it has excellent network support, and it is platform independent. That is why it is a good choice for agent programming. Smalltalk and Objective C, which is an object-oriented superset of C with Smalltalk-style message syntax, have practically the same features. These programming languages are also a good choice for agent programming. In general, all technologies that incorporate an object-oriented language and development environment can be successfully used for building agents.

6. KNOWLEDGE TYPOLOGIES
As discussed in the previous sections, the notion of knowledge plays a crucial role in both KM and AI. At the same time, many aspects of this notion have not been investigated and described in the detail needed in practice. There have been very many publications on knowledge in recent years. As was shown in the second section, almost all definitions fail to define knowledge
in absolutely clear terms. In many cases this concept is used as it is defined in everyday life. It is obvious that in knowledge management a much deeper understanding of the nature of knowledge, its typologies, knowledge possessors and sources, etc. must be achieved. The reason is that in the current context in which organizations find themselves (a rapidly changing environment, worldwide competition, a crucial need to be innovative, etc.), organizations that have been designed to optimally support the decision-making process by enhancing information processing capacity are not operating effectively enough. A growing number of organizations find themselves in the process of creating new knowledge [2]. So, organizations continually gather, convey, utilize and create knowledge with the aid of information. Many different questions arise and must be answered on the way towards a really creative organization. First, one must know:
• Who knows what
• Who needs to know what in the organization
• Who or what are the sources and sinks of knowledge
• How to elicit knowledge
• How knowledge is generated
• Where knowledge networks are located
• How dynamic knowledge is.
Second, one must understand that knowledge of markets is a business weapon, knowledge about customers makes selling easier, knowledge about people means better working groups, and many other things. Moreover, a classification of knowledge categories may give a better understanding of why knowledge is so different, i.e., why it is tacit or explicit, soft or hard, natural or "artificial", and so on. This understanding is needed to make critically important knowledge explicit and widely accessible, to use intelligent systems for deep knowledge capturing, and to implement intelligent agents for organizational learning purposes. And last but not least, a better understanding of basic KM concepts is the prerequisite for the effective and efficient use of KM techniques and tools.

6.1. Notion of Knowledge and Knowledge Possessors
Knowledge is the phenomenon that knowledge management intends to manage. Therefore an understanding of the nature of knowledge is relevant to achieving good results in this kind of management activity. The first philosophical discussions concerning knowledge are attributed to such thinkers as Plato, Aristotle, Sextus Empiricus, Augustine, and Thomas Aquinas [34]. Since then the debates on this topic have continued. However, they have not yet resulted in a common view on knowledge or in absolutely clear definitions of the phenomenon. Another important aspect of knowledge is its dynamic development. The definition of knowledge given by Sildjmae [38] states the following: "Knowledge consists of dynamic functional structures. It comprises the unity of the three following aspects: first, understanding of the reality, second, attitude to the reality, and, third, corresponding reaction."
Thus, definitions of knowledge coming from different areas of investigation show that in considering an agent's knowledge the following three very important aspects must be understood:
• The systemic nature of knowledge
• The dynamic development of knowledge
• The ownership of knowledge.
An agent's knowledge is a system that it uses to perform any mechanical or intellectual task. This system is an open one and changes according to the agent's environment and the nature of the task. A concept is usually referred to as an elementary unit of knowledge [39, 40]. Concepts are organised in aggregate, causal, historical, contextual and other types of structures and thus form knowledge subsystems. Each subsystem of knowledge, taken together with its external links, is a piece of knowledge [41]. Where the presence of external links is optional, the subsystem of knowledge may be called a part of knowledge. In the case of a human being, the use of knowledge simultaneously introduces changes in that knowledge [42]. New knowledge pieces may be added to the existing knowledge, the level of truthfulness of some parts of knowledge may be changed, and the internal structure of knowledge may be modified. Similar changes may also occur in the knowledge of software agents and robots. Some parts of an agent's knowledge can be copied onto another medium of knowledge and thus be possessed not only by the agent that initially owned the knowledge, but also by other knowledge possessors. All knowledge possessors can be divided into natural and artificial ones [41] (Figure 2). Artificial knowledge possessors, in turn, can be divided into active (AI techniques) and passive knowledge possessors. Natural and active artificial knowledge possessors possess a dynamic knowledge system and thus belong to the class of agents. Agents can elaborate copied knowledge in their knowledge system and represent it in ways that differ from the original one. Passive knowledge possessors do not change the form of the initial copy of the knowledge. In other words, active knowledge possessors or agents have knowledge processing capabilities, while passive possessors or passive objects do not. The boundary between passive and active possessors of knowledge is to a certain extent fuzzy. The main distinction is that passive possessors can represent information but cannot change it or generate new knowledge on the basis of the existing one. In an organizational setting, natural knowledge possessors are the management and employees of the organization, and also customers, providers, consultants, and representatives of competitors and partners of the organization. Artificial knowledge sources are different kinds of documents, as well as more sophisticated means of knowledge representation such as virtual reality and multimedia elements, and AI techniques such as active databases, neural networks, etc. (see Figure 2).

Figure 2. Knowledge possessors. (The figure divides knowledge possessors into natural possessors and artificial possessors; artificial possessors are further divided into active possessors, or agents, such as robots, expert systems, intelligent decision support systems, artificial neural networks, active and intelligent database systems, pattern recognition and image processing software, virtual reality, and intelligent information systems and CASE tools, and passive possessors, such as documents, pictures, images, and audio and video records.)

Viewing knowledge as an inherent property of an agent, the following definition of knowledge [43] satisfies all three knowledge aspects stated above (systemic nature, dynamic development, and ownership of knowledge):
"Knowledge comprises all cogmnvc expectancies-observations that have been meaningfully organised, accumulated and embedded in a context through experience, communication, or inference-that an individual or organisational actor uses to interpret situations and to generate activities, behaviour and solutions no matter whether these expectancies are rational or intentional," Original definition is provided only regarding human knowledge. However it may be also applied to artificial agents ifwe understand cognition as a complex set of mental processes by which humans acquire, organize, and apply knowledge [44]. In case of artificial agents the software processes stand instead of mental processes. In knowledge management the terms "knowledge" and "information" are used in parallel. There are many definitions that define knowledge as a special kind of information and there are definitions that define information as a special kind of
Table 1. Knowledge typologies organised around twelve dimensions.
1. Tacit vs. explicit knowledge
2. Internal vs. external knowledge
3. Electronically accessible vs. electronically inaccessible knowledge
4. Secured vs. unsecured knowledge
5. Individual vs. collective (materialised in organisational routines) or migratory knowledge
6. Formal, institutionalized, approved vs. informal, unapproved knowledge
7. Specific, particular, contextualised vs. abstract, general, decontextualised knowledge
8. Knowledge as a product vs. knowledge as a process, or the distinction between knowledge held by an object, an individual or a social system
9. Substance: natural vs. artificial knowledge
10. Mode of acquisition: built-in or inherited, elicited, and inferred knowledge
11. Expertise level: declarative, procedural, and packaged knowledge
12. Mode of appearance: soft, hard (encoded), and manifested knowledge
In this chapter the knowledge of an agent is regarded as the primary source of any information available. This means that information is considered a product of knowledge, not the opposite.

6.2. Types of knowledge

In many cases researchers do not attempt to define knowledge. Instead, they describe knowledge by different knowledge types [45]. More than 20 different knowledge typologies are frequently mentioned in readings on KM. Condensed overviews and analyses of knowledge typologies are given in several sources [10, 43, 46, 47]. Table 1 amalgamates several knowledge typologies organised around twelve dimensions. The typologies of the first eight dimensions are suggested by Maier [43] as the most important knowledge types from the organisational point of view. All these dimensions of knowledge are important from the agent perspective as well; however, they do not show several factors relevant to agent knowledge acquisition and processing. These factors are reflected by dimensions 9-12 in Table 1. The most popular distinction is between tacit knowledge and explicit knowledge (Dimension 1 in Table 1). Tacit knowledge is personal knowledge embedded in individual experience, and it is shared and exchanged through direct, face-to-face contact; it can be communicated in a direct and effective way. Explicit knowledge is externalised knowledge that can be packaged as information, i.e., encoded. The acquisition of explicit knowledge is indirect because it must be decoded and re-encoded into one's mental models, where it is kept as tacit knowledge. In fact, these two types of knowledge are two sides of the same coin. Tacit knowledge is practical knowledge that is the key to getting things done. Unfortunately, tacit knowledge was frequently neglected in the past. This is particularly true of business process reengineering, where cost reduction was generally identified with the dismissal of people, the only repository of tacit knowledge. This has damaged the tacit knowledge of many organizations [43]. Explicit knowledge defines the identity, the competences, and the intellectual capital
of an organization independently of its employees, but it can grow and sustain itself only through a rich background of tacit knowledge. From the organisational perspective, the distinction between the knowledge types organised around Dimensions 1-8 (Table 1) helps to choose and utilise appropriate means of KM for each type of knowledge. From the agent perspective it is important to distinguish between the natural and artificial substance of knowledge (Dimension 9) [41]. Natural knowledge inherently resides in human brains, but artificial knowledge is possessed by a particular artificial knowledge possessor. Natural knowledge as a whole cannot be externalized, that is, it cannot be expressed in any particular language. An externalization artifact can show only a part of it. Thinking in terms of Nonaka and Takeuchi's four phases of knowledge conversion as a kind of life cycle of organizational knowledge [2], we can state that only a part of tacit knowledge can be transformed into explicit knowledge through externalization. From the agent perspective, the distinction between different modes of knowledge acquisition is also useful (Dimension 10). An agent can have built-in knowledge, it can inherit knowledge, it can elicit knowledge from an external knowledge source, and it can infer knowledge by elaborating on its own knowledge. Built-in knowledge [20] is a part of agent A's knowledge that is embedded in agent B with the special purpose of enabling B to process data and information and/or develop its own knowledge. The origin and the owner of the built-in knowledge is agent A; however, the knowledge is possessed also by agent B and is a fundamental part of B's knowledge. Only agents possess built-in knowledge. Passive objects possess encoded knowledge. Regarding natural knowledge possessors, the term "inherited" may be preferred to "built-in" when discussing the inherent human knowledge that enables cognition. The elicited and inferred knowledge of an agent is its self-acquired knowledge. Regarding human agents it is possible to distinguish between two types of self-acquired knowledge, namely, first, sensual experience and, second, abstract knowledge that consists of memories about sensual or intellectual experiences. There are three types of abstract knowledge that play a significant role in expertise development [42] (Dimension 11 in Table 1). Declarative knowledge dominates in the initial stages of skill acquisition; later, procedural knowledge is developed; at last, when proper speed and accuracy in skill application are reached, many things are done automatically on the basis of so-called packaged knowledge. The externalisation of packaged knowledge and its sharing with other agents is problematic, because packaged knowledge belongs to the tacit knowledge of the human agent. The declarative, procedural and packaged knowledge types may also be used for artificial knowledge processing agents. Externalized knowledge can be exhibited using any artificial or natural mode of knowledge transfer, e.g., paper, electronic files, sound, movement, etc. In an organizational setting two kinds of externalized knowledge are exploited, namely, first, soft knowledge and, second, hard (encoded) knowledge (Dimension 12). Externalized knowledge can also be regarded as information. Soft information is characterized as being fuzzy, unofficial, intuitive, subjective, implied and vague. It is acquired in face-to-face communication, telephone conversations, tours, and social activities, and transferred
by gossip, assessments, interpretations, etc. On the other hand, hard information is characterized as definite, certain, official, factual, clear, and explicit [48]. Only a part of the knowledge in an organisation can be externalised. Therefore a third type of knowledge appearance, manifested knowledge [49], should be recognised (Dimension 12). Manifested knowledge is the knowledge behind the agent's products and processes, which are perceivable representations of this type of knowledge. Externalized knowledge, and in particular cases also manifested knowledge, has a special role in the organizational context, because it can be captured using artificial knowledge possessors and therefore exploited independently of the agent that provided the knowledge. Such knowledge, which is independent of its owner or creator, is called migratory knowledge [11]. It is one of the main sources of knowledge to be exploited with the help of AI techniques.

6.3. Sources of knowledge
In general, everything around the agent, and the agent itself, can be regarded as a source of knowledge. However, this "everything" does not immediately and fully become the agent's knowledge. The portion of knowledge taken in by the agent depends on its perceptive ability. The notion "source of knowledge" should be considered from two points of view, namely from the point of view of the seeker of knowledge and from the point of view of the provider of knowledge. From the point of view of the seeker of knowledge, a source is anything that can be used to develop the seeker's knowledge, i.e., it is anything that the agent is able to perceive concerning the object of interest. By the object of interest is denoted here any phenomenon of interest: physical objects, software objects, events, processes, etc. From the point of view of the provider, knowledge can be represented in the following two ways:
• As manifested knowledge, i.e., the object of interest, made, possessed or organised by the provider, which is at the disposal of the seeker of knowledge
• As abstract externalized knowledge [42], when particular abstractions of the provider's knowledge are presented directly (as in a face-to-face interview) or via a particular medium.
From the point of view of the seeker of knowledge, manifested knowledge is the knowledge that is represented by a particular artifact, natural object or other phenomenon, such as an event or process [49]. Intelligence, observation and experimentation are necessary to discover the original knowledge that is behind the phenomenon faced by the interested agent. Only hypothetical knowledge concerning the object can be obtained by the agent that seeks knowledge. To provide abstract knowledge, an agent-provider must sort out his experience and decide what properties and relationships of the elements of his knowledge he is going to present. In locating knowledge sources it is important to distinguish between the master's knowledge and the observer's knowledge concerning the object of interest (Figure 3). Both the master and the observer have tacit knowledge and can provide explicit knowledge, but the content of
the knowledge differs in terms of declarative, procedural and packaged knowledge. Therefore there can be quite a considerable difference, not only as regards the tacit but also the explicit knowledge, between what the master of the object provides and what the observer of the object provides. Thus, the agent that seeks knowledge about a particular object has the following main sources of knowledge (Figure 3):
• The object of interest, which represents manifested knowledge
• The agent that has made the object (the master of the object), i.e., the knowledge of the master (externalized or non-externalized)
• The agent that has observed (or investigated) the object (the observer of the object), i.e., the knowledge of the observer (externalized or non-externalized)
• Descriptive migratory knowledge that has been prepared by the master of the object (externalized master's knowledge)
• Descriptive migratory knowledge that has been prepared by the observer of the object (externalized observer's knowledge)
• The seeker of knowledge itself, in terms of its built-in, inherited, procedural and inferred knowledge.

Figure 3. Knowledge sources.
In Figure 3, observation of the object means all possible methods of investigating the object, from simply looking at it to modern scientific methods of investigation. Communication here involves ordinary conversation, the use of special knowledge elicitation procedures, and observation of the agent. Descriptive migratory knowledge is explicit knowledge encoded using any artificial medium. The possessors of knowledge in Figure 3 are divided into three classes, namely the master of the object, the observer of the object and the seeker of knowledge. The master of the object is an agent who has made the object; the observer of the object is an agent who has investigated the object by the methods available at its disposal. The seeker of knowledge is an agent whose goal is to obtain knowledge about the object. Human agents can obtain knowledge even without the conscious purpose of knowledge acquisition [42]. However, Figure 3 reflects the situation of purposeful knowledge acquisition. There is a difference in quality, richness and completeness between the knowledge of the master of the object and the knowledge of the observer of the object. Actually, as experience shows, the observer must learn to make the object by himself or herself in order to obtain knowledge that is adequate to the master's knowledge (see, e.g., the example about the development of a bread-making machine [2]). Figure 3 reflects the sources of knowledge from the point of view of the seeker of knowledge about the object at a particular point of time ti. However, the two other agents can also be considered seekers of knowledge. Actually, the observer of the object could not be used as a source of knowledge if it had not been a seeker of knowledge at some point of time tj = ti − Δt, Δt ≥ 0. On the other hand, each object made by any agent becomes a part of the natural environment (if not purposefully restricted from it by special methods). None of the agents depicted in Figure 3 can possess complete knowledge about the natural environment; therefore, the master of the object becomes a seeker of knowledge when it observes the object in the natural environment. The seeker of knowledge, likewise, can take the roles of the observer and the master. The seeker of knowledge can utilize different methods of knowledge acquisition to get knowledge from all six types of knowledge sources and to acquire soft, encoded (hard) and manifested knowledge concerning the object of interest. All three types of agents represented in Figure 3 may be temporary or permanent groups of co-operating agents possessing internal or external organisational knowledge. Therefore both the product and process dimensions of their knowledge are to be considered. The sources of knowledge described above can be regarded as the intellectual capital of the organization [10, 14]. Definitions of intellectual capital stress such features of organizational knowledge as the collective sum of human-centered assets, intellectual property assets, infrastructure assets, and market assets that are embedded in routines and processes that enable actions. It is knowledge captured by the organization's systems, processes, rules, culture and products. Various forms of intellectual capital, for example, ideas, know-how, skills, competencies, etc., can be transformed into intellectual assets. So, intellectual capital is becoming the most valuable resource organizations have for providing their competitive advantages. The dynamics of intellectual capital requires
a new type of management capable of following fast changes within an organization. This capability may be built on the effective use of existing knowledge sources and the acquisition of new ones. The effectiveness of managing knowledge sources, in turn, may be achieved on the condition that the organisation understands the spectrum of its existing and potential knowledge sources well and is aware of the static and dynamic relationships between different organisational knowledge possessors.

7. ORGANIZATIONS AS COMMUNITIES OF AGENTS AND PASSIVE OBJECTS
Let us consider any type of organization as a set of various objects together with the relationships between these objects, i.e., an organization is a system whose components are objects. It was already mentioned that objects can be classified as active objects, called agents, and passive objects (further called simply "objects" where this does not cause ambiguity). Agents, in turn, may be natural or artificial. Natural agents are humans who act in a real environment. Nowadays there are two kinds of artificial agents, namely, software agents and robots. Artificial agents act within a real environment (robots) or within a virtual environment, that is, cyberspace (robots and software agents). All agents act within their environment via information exchange, or communication signals. Natural agents and robots may have to perform speech processing, optical processing and representation, and even be able to process percepts from such senses as touch, taste and smell. Robots have a physical embodiment equipped with sensors to perceive relevant aspects of the environment and effectors that affect the environment. Moreover, agents must understand what their percepts mean. Effective robots (artificial intelligent agents) are equipped with a representation of their physical and software environment as well as of their physical embodiment. While robots have existed for decades and belong to the agents that are well defined, this is not the case with software agents. There have been many attempts to define what software agents, or softbots, are. One definition was given in the fifth section; two others follow. Intelligent software agents are software programs that perform a given set of tasks on behalf of a user or other agents without direct human intervention, and in so doing employ some knowledge of the user's goals [50]. Jennings argues [51] that "an agent is an encapsulated computer system that is situated in some environment and that is capable of flexible, autonomous action in that environment in order to meet its design objectives." All agents are called knowledge workers whose decisions affect their environment, which could consist of other agents and/or passive objects, for instance, other types of software and/or hardware, including control devices. Environment entities can be local to the agent (on the same platform or machine on which the agent resides) or remote if the agent is connected via some type of network with other objects [30]. Fourteen classes of different systems can be considered depending on the nature of their components; an enumeration of these classes is sketched at the end of this section. First, let us consider the three classes of systems that consist only of homogeneous components. The simplest systems consist only of passive objects. We have a lot of artifacts that belong to this class of systems, for instance, furniture in a room, a
blower, or an engine. In this case the objects are not knowledge workers at all, and the systems cannot operate autonomously without external effectors. At the other end there are systems consisting only of humans, for instance, one person, small groups, a football club, etc., where people use their mental models for knowledge exchange. Artificial agents, that is, intelligent robots and intelligent software agents, form the third class of systems with only one type of component. This type of intelligent system can operate autonomously in a wide spectrum of possible environments [20]. The last two types of systems represent knowledge workers and/or their communities. Second, let us consider systems that consist of two types of components. We have six classes:
• Software agents and objects, for instance, an intelligent frame-based temperature controller in a room
• Robots and objects (a "robot living in a cubes world" is a well-known example)
• Humans and objects, for instance, operators of some complex technical system or process
• Humans and software agents (a manager supported by intelligent decision support systems, or a doctor using a diagnosis expert system)
• Humans and robots (manufacturing processes)
• Robots and software agents (an intelligent robot capable of autonomous and active exploration of the environment).
There are four classes of systems that consist of three different types of components. They are as follows:
• Humans, software agents and objects, for instance, a decision maker (manager) and an on-line expert system providing process control
• Humans, robots and objects (astronauts and a spacecraft carrying an autonomous vehicle to explore the surface of the Moon)
• Humans, software agents and robots (we can mention the previous example, where the astronauts use diagnosis expert systems in the spacecraft)
• Robots, software agents and objects (an autonomous robot working on the surface of Mars and collecting samples of rocks).
There is only one class of systems that includes all four types of components, namely, humans, robots, software agents and passive objects. It is the extension of the previous example, where astronauts use diagnosis expert systems and an autonomous robot is sent to work on the surface of Mars with the purpose of collecting samples of rocks. In the proposed classification there are several classes where passive objects are left outside the system under consideration. In these cases the objects would be included in the environment. Obviously, this assumption is somewhat artificial, because practically all systems deal with passive objects. On the other hand, general systems theory stresses that the boundary between a system and its environment is fuzzy. The decision as to which objects belong to the system and which belong to the environment depends very much on the investigator's point of view.
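The count of fourteen classes can be checked mechanically. The following sketch (ours, not from the chapter's sources) enumerates every non-empty combination of the four component types, merging the two single-type artificial-agent cases into the one homogeneous class of artificial agents, as is done in the text above:

  from itertools import combinations

  types = ["humans", "robots", "software agents", "passive objects"]

  classes = set()
  for r in range(1, len(types) + 1):
      for combo in combinations(types, r):
          # Robots alone and software agents alone form a single
          # homogeneous class of artificial agents.
          if combo in (("robots",), ("software agents",)):
              combo = ("artificial agents",)
          classes.add(combo)

  print(len(classes))   # -> 14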
8. ORGANIZATIONS AS INTELLIGENT AGENTS
The wide variety of organizations considered as collections of active objects (i.e., agents or knowledge workers) and passive objects suggests that it is hopeless to develop an effective general-purpose KMS usable for all classes of organizations defined in the previous section. At the same time, the role of KM is steadily growing, particularly for organizations operating in rapidly changing environments. For such organizations (though not only for them), KM based on the active use of past experience and skills is the relevant way towards more effective performance in the future. From this point of view, an organization's knowledge life cycle may be represented as the organization's "knowledge space" shown in Figure 4.

Figure 4. Organization's "knowledge space". (The figure depicts tacit and explicit knowledge spanning the organization's past, present and future.)
Figure 5. Organization as an intelligent agent.
Nowadays, information technology provides access to data, information and knowledge captured in the past and organized as a "knowledge space". These resources are used with the purpose of getting additional value out of them at present and, what is even more important, in the future. Each intelligent organization is trying to reach this goal autonomously, making rational decisions and taking the best possible actions. So, the interpretation of an intelligent organization as a whole using the concept of an intelligent agent is quite natural. An intelligent organization, like an intelligent agent, perceives the current state of the environment, using its detectors (sensors) for data, information and knowledge acquisition. The knowledge about the current state and the goal state is used to determine actions that will be applied, through effectors, in the organization's environment. This output is determined on the basis of percepts and built-in knowledge. The interpretation of an organization in terms of intelligent agents is shown in Figure 5. In the field of KM, all the knowledge used to support an organization's activities, e.g., business processes, is considered to be the organization's intellectual capital. More precisely, an organization's intellectual capital is formed from
• Human knowledge
• Knowledge embedded in the organization's business processes, products and services
• Internal relationships in the organization (relationships between agents operating in the organization) and relationships between the organization and its environment.
The creation and use of the organization's intellectual capital frequently cause several serious problems, for at least two main reasons:
• Workers (employees) are unwilling to share their knowledge
• Workers who leave the organization take their knowledge, experience and skills away with them.
As a consequence, a rather long time is required before novice workers (novice business people) are able to acquire the needed knowledge and skills. How may intelligent organizations solve these problems? First, organizations must develop their own culture to support knowledge sharing. Moreover, knowledge sharing must be promoted using the corresponding technologies. Second, organizations must try to capture the knowledge that is in the heads of their workers. This goal may be reached in different ways, starting with promoting communication between individual workers (transmission of tacit knowledge) and ending with building repositories, data warehouses and knowledge bases of explicit knowledge (making tacit knowledge explicit). So, all available means, tools and techniques must be used to make the process of acquiring new knowledge and skills easier for novices. What are the main activities of organizations as intelligent agents in building their own intellectual capital? First, they must perceive and identify the intellectual values that are in the environment as well as inside the organizations themselves. Second, they must evaluate whether the identified intellectual values are sufficient for reaching the predefined goals, running business processes and raising competitiveness. Third, the organizations must create additional value from their intellectual capital by choosing more rational actions. The maintenance of the knowledge flow provided by the KMS is the vehicle for the generation of new additional value from the intellectual capital of the organization. This will lead to improved business processes in the future, as shown in Figure 6.

Figure 6. The role of knowledge management for business process improvement.
From the point of view of an organization as an intelligent agent, KM is considered to be knowledge acquisition, processing and use for rational decision making and choosing the best actions, as well as for the generation of new knowledge. In other words, KM is the systematic management of the intellectual capital of an organization. KM is directly connected with decision making, and its success depends on several factors. First, an intelligent organization-agent must clarify its needs, i.e., the problems in its own knowledge flows. Second, an intelligent organization-agent must create the corresponding infrastructure for KM. The infrastructure consists of mutually integrated techniques and tools, usually called the KMS [8]. The main functions of the KMS are the following:
• Detection of information and/or knowledge (the function of sensors)
• Storage of information and/or knowledge (the function of the memory)
• Inference of conclusions (the function of the mind or inference engine)
• Retrieval and visualization of knowledge
• Decision making.
This list clearly shows that these functions are the same ones that characterize any intelligent agent (a schematic rendering is sketched below). All business processes are supported by intelligent organization-agent activities. An intelligent organization-agent makes decisions and acts because it uses knowledge expressed in some perceivable form. Each activity is an element of the decision-making process. Intelligent organization-agents generate alternatives and model possible situations, which are the possible results of applying the chosen actions. And finally, intelligent organization-agents make decisions knowing the goals and the utilities of the predicted outcomes of actions.
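The five functions listed above map naturally onto an agent-like interface. The following outline (all names are our own illustrative assumptions, not part of any existing KMS) shows one possible shape of such a system:

  # An illustrative outline of the five KMS functions, phrased as the
  # interface of an intelligent agent.
  class KnowledgeManagementSystem:
      def __init__(self):
          self.memory = []                      # storage (function of the memory)

      def detect(self, source):
          # Function of sensors: pick up information/knowledge from a source.
          return source.read()

      def store(self, item):
          self.memory.append(item)

      def infer(self):
          # Function of the mind or inference engine: derive conclusions.
          return [f"conclusion from: {item}" for item in self.memory]

      def retrieve(self, query):
          # Retrieval (and, in a real system, visualization) of knowledge.
          return [item for item in self.memory if query in str(item)]

      def decide(self, goal):
          # Decision making on the basis of stored and inferred knowledge.
          conclusions = self.infer()
          return conclusions[-1] if conclusions else f"gather knowledge for: {goal}"

  class Document:                               # a trivial knowledge source
      def __init__(self, text):
          self.text = text
      def read(self):
          return self.text

  kms = KnowledgeManagementSystem()
  kms.store(kms.detect(Document("market report: demand is rising")))
  print(kms.retrieve("market"))
  print(kms.decide("enter new market"))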
9. ORGANIZATIONS AS MULTIAGENT AND KNOWLEDGE MANAGEMENT SYSTEMS

If one looks in more detail at how an organization's business processes are supported from the inside, one finds that organizations employ managers, research assistants, advisers, secretaries, etc. as an omnipresent staff. They are employed as schedulers, planners and searchers to do diverse mundane tasks. In a KMS all these activities require intelligent support, which may be implemented in the form of communities of intelligent agents. This "inside look" at the intelligent organization-agent is shown in Figure 7.

Figure 7. Schematic diagram of an organization as a multiagent and knowledge management system.

Intelligent agents (the staff of an organization), using the organization's intellectual capital and supported by the KMS, continuously try to improve business processes. Now let us consider how the intelligent agent paradigm may be integrated with the KMS to build an intelligent organization's knowledge management system. The conceptual model of an organization's knowledge management system (OKMS) based on the intelligent agent paradigm is shown in Figure 8.

Figure 8. Conceptual model of an organization's knowledge management system. (The figure shows three layers: a virtual and a physical co-operation platform, a structural layer, and the "engine room".)

The basic idea of the conceptual model is that the OKMS must operate like the human brain and fulfill the following basic functions: knowledge acquisition through
sensors, knowledge storage in some kind of memory, inferencing, and knowledge retrieval and representation. The conceptual model consists of two main parts: an organization as a multiagent system for business process support, and a knowledge management system. The conceptual model has three layers, called an "engine room", a structural layer and a "co-operation platform". The "engine room" is an integrated set of technologies, hardware and software providing knowledge acquisition, storage, processing, retrieval and representation. The purpose of the structural layer is to identify the intellectual resources of the organization and to organize knowledge so as to make it easily accessible and applicable. A "co-operation platform" is the physical and/or virtual environment where the organization's intelligent agents may communicate with each other for effective knowledge sharing and distribution to achieve the business process goals. A "co-operation platform" maintains such components as video conferencing, chat rooms, electronic whiteboards and other tools for co-operative work (groupware). It should be pointed out that at the present moment the proposed conceptual model of the OKMS has not been implemented. The next step towards the implementation of the conceptual model is an estimation of the potential already demonstrated by intelligent agents and multiagent systems for KM. For this purpose, let us mark out three groups of agents: 1) agents that may promote knowledge management and may be used as the organization's common vehicle of the "engine room"; 2) agents that provide communications; 3) personal agents of knowledge workers. Starting this overview, it is worth pointing out some relevant features of a KMS that show the similarities between the proposed conceptual model and the known concepts on which KMS notions are based. According to [8], a framework of the KMS consists of:
• The use of problem finding and its related techniques to determine present and future problems and to identify future opportunities
• A knowledge infrastructure that is related to very large databases, data warehouses, and data mining (authors' remark: we wonder why knowledge bases are missing from this list)
• Network computing (a company's intranets and extranets, and the Internet) to allow dissemination of relevant knowledge
• Appropriate software that is focused on data, information and knowledge collection, search for needed knowledge, and sharing of knowledge.
Thus, the KMS centers on the organization, representation (codification) and dissemination of knowledge in an organization. The KMS represents a collaborative work environment in which organizational knowledge is captured, structured and made accessible to facilitate more effective decision making and actions to reach the business
process goals. KMS have been influenced by different kinds of prior information and knowledge-based systems [52]. First, there were management information systems (MIS), which provide periodical reports and give periodical answers about what should have been done [8]. The next step was the addition of the viewpoint of the decision maker, implemented in decision support systems (DSS). These systems were designed to support the problem-finding and problem-solving decisions of the manager. The evolution of the DSS resulted in three new types of systems, namely, group decision support systems (GDSS), executive information systems (EIS) and idea processing systems (IPS) [8]. GDSS combine computers, data communication, and decision technologies to support problem finding and problem solving for managers and their staff. The emergence of technologies such as groupware, electronic boardrooms equipped with electronic whiteboards or large-screen projectors, LANs, the Web and video conferencing, decision support software, etc. has promoted interest in these systems. EIS bring together relevant data from various internal and external sources to obtain useful information that helps to make strategic and competitive decisions. These systems filter, compress and track relevant data as determined by each individual executive end user. IPS are a subset of GDSS and are designed to capture, evaluate, and synthesize ideas into a larger context that has real meaning for decision makers. The inputs of these systems are the problem statement and the observations about the problem. Processing involves idea generation and evaluation for problem solving. The outputs are a report and the dissemination of information about specific ideas for solving the problem. The on-line analytical processing (OLAP) systems are closely related to the previous kinds of systems. These systems center on the question "what happened" and provide a multidimensional view of aggregated and summarized data; that is, OLAP tools make it possible to look at different dimensions of the same data stored in databases and data warehouses. As such, these systems and tools provide a starting point for knowledge discovery within the KMS's operating mode [8]. From the broader view, knowledge discovery or data mining tools are needed to complement OLAP systems, because these tools tell decision makers why something has happened in their business. Knowledge discovery tools are capable of uncovering patterns that can lead to the discovery of new knowledge. So, they are considered to be the next step beyond OLAP systems for querying data warehouses, and a prerequisite for the interpretation and dissemination of knowledge. Knowledge acquisition, processing and usage have typically been implemented in knowledge-based systems (KBS), in particular, in expert systems. These systems are designed to simulate an expert's problem-solving abilities in a narrowly specified problem domain. In the KM context, expert systems can be thought of as knowledge transfer agents [8]. The problem with expert systems is well known: they are able to respond only to queries about something that is stored in their knowledge bases; otherwise they cannot respond. This is where neural networks can help, because neural networks learn the human decision-making process from examples, internally developing the proper algorithms for problem solving. Thus, neural networks do not require a complete knowledge base and extensive interpretation of its contents by an inference engine. Neural networks
are effective in processing fuzzy, incomplete, distorted, and "noisy" data. They are suitable for decision support under conditions of uncertainty, and extremely useful in data mining and other specialized database tasks. Recent trends manifest the transition to a combined environment of KMS and advanced techniques, such as virtual reality, multiagent systems, and Web intelligence [29, 30]. The integration of KMS with virtual reality allows decision makers to think from a different perspective and, as a consequence, to enhance their skills. Using sophisticated interactive computer graphics, special clothing and fiber optic sensors, it is possible to treat system-generated objects almost as real things. These developments are to a large extent related to the appearance of the notion of "cyberspace" as an environment for both humans and intelligent agents [53] to support interactive users. Building agents that can live and survive in a broad variety of environments, including hostile ones, promotes exciting new results in AI as well as new promising applications. It will be shown later that intelligent software agents, in particular, offer an ideal technology platform for providing data sharing, personal services, and pooled knowledge
[54].
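To make the contrast drawn above concrete, here is a toy, hedged illustration (the data set, the single-neuron model and every parameter are assumptions for demonstration only): a classifier learns a decision rule from noisy examples, with no knowledge base or inference engine involved.

    import random

    random.seed(0)
    # Hypothetical training data: label is 1 when x1 + x2 > 1, with 10% label
    # noise standing in for the "fuzzy, distorted, noisy" data discussed above.
    data = []
    for _ in range(200):
        x1, x2 = random.random(), random.random()
        label = 1 if x1 + x2 > 1 else 0
        if random.random() < 0.1:
            label = 1 - label
        data.append((x1, x2, label))

    # Perceptron-style training: the "rule" is learned, never written down.
    w1 = w2 = b = 0.0
    for _ in range(50):
        for x1, x2, y in data:
            pred = 1 if w1 * x1 + w2 * x2 + b > 0 else 0
            w1 += 0.1 * (y - pred) * x1
            w2 += 0.1 * (y - pred) * x2
            b += 0.1 * (y - pred)

    errors = sum(1 for x1, x2, y in data
                 if (1 if w1 * x1 + w2 * x2 + b > 0 else 0) != y)
    print(f"training errors: {errors}/200")  # fits most examples despite the noise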
9.1. Intelligent agents for OKMS "Engine Room"
How is the intelligent agent paradigm exploited in KM already now, and what are the perspectives of single agents and multiagent systems in this field? To answer the question, let us describe the agents from each of the groups mentioned above, starting with the first group: agents that may be used to build an OKMS's "engine room". At the beginning, two aspects should be stressed. First, we have neither the intention nor the possibility to give an exhaustive description, due to the sweeping changes in this field. Second, our division into groups has fuzzy boundaries, because several agents may be included in more than one group. Nowadays agents are good at performing lists of tasks when specified triggers (events like "report completed", "fax received", and so on) prove to be true [30]. Agents serve for monitoring and collecting information from data streams and taking action on what they encounter. In this case multiple agents are responsible for network access, searching for information and filtering it. They are designed for information handling in information environments like WANs and LANs, for instance, the Internet, an organization's intranet, etc. These agents are commonly used because, for most people, navigating and using network systems is increasingly difficult and time consuming. Moreover, for intelligent agents other than humans, the information available on the Web is not understandable at all, and hopes to change this are connected with the evolution of the Semantic Web. In [30] Knapik and Johnson list a plethora of agents that can be useful in KMS. First, there are network agents like the NetWare management agent (NMA) or the NetWare LANalyzer agent, and many others. The NMA provides the NetWare management system with server statistics and notification of alarms, so the network supervisor can monitor, maintain, and optimize server performance in a distributed computing environment from a single location. The NetWare LANalyzer agent is designed to complement the
NetWare management system. This agent monitors the interaction between various devices on the network, warns of potential problems and helps optimize Ethernet and Token Ring segments. Network software distribution agents make the process of installing or updating software (operating systems, new data, applications) completely transparent to users across a network of any size. The network administrator can graphically configure and launch an agent. After that, the agent can either install software or data on all nodes in the network, or delay this until the network is less crowded. Connection and access agents automatically configure and connect the user to the correct service depending on his/her needs and the available resources.

Second, database agents become ever more valuable in database management due to the fact that data warehouses are becoming huge and complex. This class of agents can perform many useful tasks, including supporting data integrity in the database and enforcing constraints (for instance, preventing out-of-range data from being stored, or illegal operations from trashing data), and distributing reports that can be automatically formatted in many different ways and distributed via e-mail, fax, on-line services, the Internet, and so on. In distributed database systems, agents can perform backups and other routine tasks. Database agents in the future may automate all database access and updating, check the validity of data, and perform natural language queries. They will also coordinate application execution among distributed databases, support data security requirements and maintain referential integrity [30].

Without doubt the richest source of data, information and knowledge accessible to any organization nowadays is the WWW (the Web, in brief). Unfortunately, we must conclude that the Web currently contains a lot of data, more and more structured data (structured documents, online databases) and simple metadata, but very little knowledge, i.e., very few formal knowledge representations [55]. One of the main reasons is that the knowledge is encoded using various languages and practically unconnected ontologies. As a consequence, each knowledge source requires the development of a special wrapper for its knowledge to be interpreted and hence retrieved, combined and used. Many researchers are trying to overcome these problems. Their efforts have resulted in the appearance of a new paradigm, so-called Web intelligence, for developing Web-supported social network intelligence. Many details on the approaches and tools developed in this very hot research topic, for example, intelligent Web agents, information foraging agents living in the Web space, social agents in Web applications, Web mining and farming for Web intelligence, intelligent Web information retrieval, Web knowledge management, and Web intelligent systems, are given in [29]. The final goal of all these research efforts is the Semantic Web. Though the Semantic Web vision begins with information discovery [56], its potential goes well beyond information discovery and querying. In fact, it encompasses the automation of Web-based services. The influence of the Semantic Web on knowledge management is obvious. New exciting perspectives will appear as researchers come closer and closer to the goal of the Semantic Web: the Web that is unambiguously computer interpretable, and thus very accessible to intelligent agents.
The Semantic Web would allow intelligent agents to do the work of searching for and utilizing services required by organizations as well as humans [27].
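Returning to the "engine room" agents described at the start of this section, a minimal sketch of the trigger mechanism might look as follows (the event names, actions and the whole API are hypothetical illustrations, not taken from any cited system):

    from typing import Callable

    class MonitoringAgent:
        """Watches an event stream and acts when a registered trigger proves true."""

        def __init__(self) -> None:
            self.triggers: dict = {}

        def on(self, event_name: str, action: Callable) -> None:
            # Register an action to fire when the named trigger event occurs.
            self.triggers.setdefault(event_name, []).append(action)

        def observe(self, event: dict) -> None:
            # Check an incoming event against registered triggers and act.
            for action in self.triggers.get(event["name"], []):
                action(event)

    agent = MonitoringAgent()
    agent.on("report completed", lambda e: print("archiving report:", e["subject"]))
    agent.on("fax received", lambda e: print("routing fax from:", e["subject"]))
    agent.observe({"name": "fax received", "subject": "supplier X"})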
9.2. Agents that provide communications
Now let us continue with the second group of intelligent agents: agents that provide communications. Communication between the individuals of the multiagent community is the most relevant issue for effective knowledge creation, sharing and distribution in the KMS. Several communication management agents are already known, and many others will appear in the near future. Messaging agents, for instance Wildfire, can connect people with each other no matter where they are and what communication medium they use [30]. Agents that are responsible for the real-time monitoring and management of telecommunication networks, that is, for call forwarding and signal switching and transmission, also belong to this class of agents. Assistant agents can perform automated meeting scheduling, inviting the right people, fixing meeting details like location, time, and agenda, and arranging teleconferencing and videoconferencing if necessary (a sketch of such a scheduler follows below). The next step in agent technologies that provide communications is the use of cooperative agents that are able to communicate with other agents and collaborative agents that are able to cooperate with other agents. Ending this short overview of agents that provide communications, let us point out that in [57] the discussed classes of agents are included in groupware, that is, hardware and software technology to assist interacting groups. Computer Supported Cooperative Work, in its turn, is the study of how groups work, and of how this technology helps to enhance group interaction and collaboration for the promotion of knowledge flow and transformation in the OKMS. There are many groupware systems, for instance, GDSS, workflow management systems, meeting coordinators, desktop conferencing (audio and video) systems, distance learning systems, systems for group (concurrent) editing and reviewing of documents, etc. In addition, computer aided software engineering (CASE) and computer aided design (CAD) tools are well-known representatives of groupware systems. Besides the groupware modules relevant for operating the entire groupware system, modules that perform specialized functions and involve specialized domain knowledge are frequently needed. These modules are called team agents [57]. Examples of team agents are user interface agents, "social mediators" within an electronic meeting, and appointment schedulers that schedule a meeting among a group of people by selecting a time slot that is free for all participants.
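As a hedged sketch of the appointment-scheduling behaviour just described (the hour-slot calendars and all names are illustrative simplifications, not from the source), a common free slot can be found by intersecting the participants' free times:

    def free_slots(busy, day=range(9, 17)):
        """Hours of a working day (9:00-16:00 starts) not already booked."""
        return {hour for hour in day if hour not in busy}

    def schedule_meeting(calendars):
        """Return the earliest hour that is free for every participant, if any."""
        common = set.intersection(*(free_slots(busy) for busy in calendars))
        return min(common) if common else None

    # Each set holds the hours a participant is already busy.
    calendars = [{9, 10, 13}, {10, 14}, {9, 14, 15}]
    print(schedule_meeting(calendars))  # -> 11

9.3. Personal agents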
Finally, let us discuss the role of personal agents in KM. Personal agents belong to humans, support human-computer interaction and help knowledge workers to acquire, process and use knowledge. Several types of these agents can be considered, namely search, assistant, filtering and work-flow agents [30]. Search agents are the most commonly used ones and work in different ways. Some agents search the titles of documents or the documents themselves, while others search other indexes or directories on the Web. Filtering agents may monitor the data stream, searching the text for key words and phrases as well as lists of synonyms, and try to forward only the information that users really need. These relatively simple agents can ideally search any document found and download it if the search criteria are met. More
sophisticated filtering agents can be trained by providing sets of examples illustrating articles that users choose to read. Afterwards these agents begin to make suggestions to the user and receive feedback, which leads to a more representative profile of the user's needs. Assistant agents are designed to wait for events such as e-mail messages to occur, then to sort them by sender, priority, subject, etc. These agents can also automatically track clients and remind users of follow-up actions and commitments. In KM, work-flow agents are useful for daily task coordination, appointment and meeting scheduling, and routing communication from e-mail, telephones and fax machines. The facilities to support these types of agents are going to appear as the embedded real-time operating system vendors start to incorporate the standard infrastructure and language support [30]. The progress in personal agent technologies is connected with the use of smart agents [54] that exhibit a combination of all the capabilities that are characteristic of cooperative, adaptive, personal and collaborative agents. Smart agents will be able to collect information about databases and business applications, as well as to acquire, store, generate and distribute knowledge. Nowadays we are entering the intelligent agent age using relatively simple agents, but even this situation offers pretty good opportunities to build an agent-based environment for knowledge worker support in the KMS, as shown in Figure 9.
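A toy version of the trainable filtering agent described at the start of this subsection may clarify the mechanism; the keyword profile, the learning rate and the sample article are all illustrative assumptions:

    # The agent keeps a keyword-weight profile and adjusts it from user feedback.
    profile = {"agents": 1.0, "knowledge": 1.0, "football": 0.0}

    def score(text):
        """Relevance estimate: sum the weights of profile keywords in the text."""
        words = text.lower().split()
        return sum(weight for kw, weight in profile.items() if kw in words)

    def feedback(text, liked, rate=0.5):
        """Reinforce (or weaken) the keywords that appear in the judged article."""
        for kw in profile:
            if kw in text.lower().split():
                profile[kw] += rate if liked else -rate

    article = "intelligent agents for knowledge management"
    print(score(article))           # initial estimate: 2.0
    feedback(article, liked=True)   # the user read and liked it
    print(score(article))           # similar articles now rank higher: 3.0

Figure 9. An agent-based environment of the knowledge worker.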
For a future vision, let us speculate on the future impact of intelligent agents on the KMS. Today we are making only the first steps towards the development of a cyber-civilization where agents will help efficiently by providing uniform access to Web resources (the Semantic Web, which is understandable to intelligent agents, instead of the Web, which is understandable only to humans), making it possible to get information in time and to acquire, store, process and share knowledge. The future evolution of suitable agents for KM is connected with information agents and their extension, knowledge agents, which will be able to learn from their environments and from each other, as well as to cooperate with each other [30]. They will have access to many types of information and knowledge sources and will be able to manipulate information and knowledge in order to answer queries posed by humans and by other knowledge agents. Teams of agents will be able to search Web sites, heterogeneous databases and knowledge bases, and work together to answer queries that are outside the scope of any individual intelligent agent. These agents will execute searches in parallel, showing a considerable degree of natural language understanding and using sophisticated pattern extraction, graphical pattern matching, and context-sensitive searches. Coordination of agents will be handled either by supervising agents or via communication between the searching agents. So, more and more activities performed by humans will be automated, which allows us not only to speak about an agent-enhanced human but even to replace at least some of the humans that now provide information and knowledge base services by intelligent agents and their communities. This, in turn, will crucially impact the evolution of the KMS, making them more and more intelligent.

10. CONCLUSIONS
The analysis of the two different tracks in KM, namely people knowledge management and information technology knowledge management, reveals the existing gap between them. In this paper the intelligent agent paradigm is used as "a bridge" between these two rather isolated fields. The amalgamation of advanced AI and KM techniques may give a synergy effect for the development of OKMS based on single intelligent agents and their communities. The paper has several objectives. First, in order to realize the importance of the concepts used in KM, we discuss the paradigm shift in organizational thinking from information to knowledge processing. Second, we consider the many, sometimes even conflicting, definitions of knowledge management and classify them into three classes using formal, process and organizational aspects as criteria. Third, we introduce the reader to the intelligent agent paradigm and describe the essence of simple agents as well as multiagent systems. Fourth, we discuss the notion of "knowledge" in detail and describe knowledge possessors, knowledge types and knowledge sources. Fifth, in accordance with their active agent and passive object components, we divide organizations into fourteen different systems. We show that an organization as a whole may be analyzed as an intelligent agent, introduce the notion of an organization's "knowledge space" and outline the role of KM in an organization's business process improvement. We propose a novel conceptual model of the OKMS based on the intelligent agent paradigm. Regardless of the efforts needed to
achieve a considerable amalgamation of AI and KM techniques, we see the potential of the proposed model moving towards the development of a general approach that will make the intellectual capital of an organization work as an effective knowledge engine in the framework of the KMS. Practical implementation of the proposed model is the major topic of future work. It could serve as a research platform for the integration of interdisciplinary approaches to knowledge management. Being enthusiastic about the perspectives of intelligent agents and multiagent systems in general, we cannot neglect the dark sides of agents that can impede the evolution of the KMS. Rogue or maliciously programmed agents can act like the worst viruses or try to destroy the whole KMS and, as a consequence, the organization itself. On the other hand, by taking care of agent security and privacy issues in the world of distributed agents, we can achieve vital progress in KMS development, making them more and more intelligent. To conclude, subsequent efforts are needed, with a focus on the implementation and application of various intelligent agents and multiagent systems, to develop advanced KMS.

ACKNOWLEDGMENTS
This work would not have been possible without the contribution of Dace Apshvalka, MSc.

REFERENCES
[1] Piccoli, G., Ahmad, R., and Ives, B. Knowledge management in academia: a proposed framework. Information Technology and Management, 1, 2000, pp. 229-245.
[2] Nonaka, I. and Takeuchi, H. Knowledge Creating Organizations. Oxford University Press, New York, 1995.
[3] Sveiby, K.-E. What is Knowledge Management?, 2000. Available at http://sveiby.com.au/KnowledgeManagement.html.
[4] Srikantaiah, T. K. Knowledge management: a faceted overview. In Srikantaiah, T. K., Koenig, M. E. D. (eds.). Knowledge Management for the Information Professional. ASIS Monograph Series, Medford, New Jersey, 2000, pp. 7-17.
[5] Koenig, M. E. D. and Srikantaiah, T. K. The evolution of knowledge management. In Srikantaiah, T. K., Koenig, M. E. D. (eds.). Knowledge Management for the Information Professional. ASIS Monograph Series, Medford, New Jersey, 2000, pp. 23-36.
[6] May, D. and Taylor, P. Knowledge management with patterns. Communications of the ACM, 46(7), 2003, pp. 94-99.
[7] Information Technology for Knowledge Management. Borghoff, U. M., Pareschi, R. (eds.). Springer-Verlag, Berlin, Heidelberg, New York, 1998.
[8] Thierauf, R. J. Knowledge Management Systems for Business. Quorum Books, Westport, Connecticut, London, 1999.
[9] Liebowitz, J. Building Organizational Intelligence: A Knowledge Management Primer. CRC Press, Boca Raton, Florida, 2000.
[10] Beckman, T. J. The current state of knowledge management. In Liebowitz, J. (ed.). Knowledge Management Handbook. CRC Press, Boca Raton, Florida, 1999, pp. 1.1-1.21.
[11] Tiwana, A. The Knowledge Management Toolkit. Prentice-Hall, New Jersey, 2000.
[12] Sarvary, M. Knowledge management and competition in the consulting industry. California Management Review, 41, 1999, pp. 95-107.
[13] Galliers, R. D. and Newell, S. Back to the future: from knowledge management to data management. In Smithson, S., et al. (eds.). Proceedings of the 9th European Conference on Information Systems. University of Maribor, Slovenia, 2001, pp. 609-615.
[14] Stewart, T. A. Your company's most valuable asset: intellectual capital. Fortune, 130, 1994, pp. 68-74.
[15] Nahapiet, J. and Ghoshal, S. Social capital, intellectual capital, and the organizational advantage. Academy of Management Review, 23(2), 1998, pp. 242-266.
[16] Wang, K. Personal communication, 2000.
[17] White, C. M. Telecommunications and networks in knowledge management. In Srikantaiah, T. K., Koenig, M. E. D. (eds.). Knowledge Management for the Information Professional. ASIS Monograph Series, Medford, New Jersey, 2000, pp. 237-253.
[18] Knowledge Management and Virtual Organizations. Malhotra, Y. (ed.). Idea Group Publishing, Hershey, USA, London, UK, 2000.
[19] Grey, D. Knowledge Management Tools. Smith Weaver Smith Cultural Changemakers, 1998. Available at http://www.smithweaversmith.com/kmtools&.htm.
[20] Russell, S. and Norvig, P. Artificial Intelligence: A Modern Approach. Prentice Hall, New Jersey, 2nd ed., 2003.
[21] Murch, R. and Johnson, T. Intelligent Software Agents. Prentice Hall PTR, New Jersey, 1999.
[22] Genesereth, M. R. and Ketchpel, S. P. Software agents. Communications of the ACM, 37(7), 1994, pp. 48-53.
[23] Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence. Weiss, G. (ed.). The MIT Press, Massachusetts, 2000.
[24] Shoham, Y. Agent-oriented programming. Artificial Intelligence, 60, 1993, pp. 51-92.
[25] Wagner, G. Agent-oriented analysis and design of organizational information systems. In Barzdins, J., Caplinskas, A. (eds.). Databases and Information Systems. Kluwer Academic Publishers, 2001, pp. 111-124.
[26] Hayes-Roth, B. An architecture for adaptive intelligent systems. Artificial Intelligence, 72, 1995, pp. 329-365.
[27] Bryson, J. J. et al. Agent-based composite services in DAML-S: the behaviour-oriented design of an intelligent Semantic Web. In Ning Zhong, Jiming Liu, Yiyu Yao (eds.). Web Intelligence. Springer, Berlin, 2003.
[28] Huhns, M. N. and Singh, M. P. Agents and multiagent systems: themes, approaches and challenges. In Huhns, M. N., Singh, M. P. (eds.). Readings in Agents. Morgan Kaufmann, San Francisco, CA, 1998, pp. 1-23.
[29] Web Intelligence. Ning Zhong, Jiming Liu, Yiyu Yao (eds.). Springer, Berlin, 2003.
[30] Knapik, M. and Johnson, J. Developing Intelligent Agents for Distributed Systems. McGraw-Hill, New York, 1998.
[31] Genesereth, M. R. Interoperability: an agent based framework. AI Expert, March 1995, pp. 34-40.
[32] FIPA - Foundation for Intelligent Physical Agents. Available at http://www.FIPA.org.
[33] Jennings, N. R., Sycara, K., and Wooldridge, M. A roadmap of agent research and development. Autonomous Agents and Multi-Agent Systems, 1(1), 1998, pp. 7-38.
[34] Musgrave, A. Common Sense, Science and Scepticism: A Historical Introduction to the Theory of Knowledge. Cambridge University Press, Cambridge, 1993.
[35] Moser, P. K. and Nat, A. Human Knowledge: Classical and Contemporary Approaches. Oxford University Press, New York, Oxford, 1995.
[36] Davenport, Th. H. and Prusak, L. Working Knowledge: How Organisations Manage What They Know. Harvard Business School Press, Boston, 1998.
[37] Aamodt, A. and Nygard, M. Different roles and mutual dependencies of data, information and knowledge: an AI perspective on their integration. Data & Knowledge Engineering, 16, 1995, pp. 191-212.
[38] Sildjmae, I. J. Artificial Intelligence: Knowledge and Thinking. Tartu Technical University, Tartu, Estonia, 1989 (in Russian).
[39] Kangassalo, H. Conceptual level interfaces for data bases and information systems. In Jaakola, H., Kangassalo, H., Ohsuga, S. (eds.).
Advances in Information Modelling and Knowledge. IOS Press, Amsterdam, Washington, Tokyo, 1991, pp. 66-90.
[40] Zack, M. H. Managing codified knowledge. Sloan Management Review, 40(4), 1999, pp. 45-58.
[41] Kirikova, M. and Grundspenkis, J. Using knowledge distribution in requirements engineering. In Leondes, C. T. (ed.). Knowledge Based Systems, Techniques and Applications, Vol. 1. Academic Press, San Diego, 2000, pp. 149-184.
[42] Anderson, J. R. Cognitive Psychology and Its Implications. W. H. Freeman and Company, New York, 1995.
[43] Maier, R. Knowledge Management Systems: Information and Communication Technologies for Knowledge Management. Springer, Berlin, Heidelberg, 2002.
[44] Drott, M. K. Cognition defined? Available at http://drott.cis.drexel.edu/I625?1625def.html.
[45] Mertins, K., Heisig, P., and Vorbeck, J. Knowledge Management: Best Practices in Europe. Springer-Verlag, Berlin, Heidelberg, 2001.
[46] Venzin, M., von Krogh, G., and Roos, J. Future research into knowledge management. In Knowing in Firms: Understanding, Managing and Measuring Knowledge. Sage Publications, London, 1998, pp. 26-66.
[47] Kirikova, M. and Grundspenkis, J. Types and sources of knowledge. In Scientific Proceedings of Riga Technical University, 5th Series: Computer Science, Applied Computer Systems - 3rd Thematic Issue. Riga Technical University, Riga, 2002, pp. 109-119.
[48] Watson, H. J., Houdeshel, G., and Rainer, R. K., Jr. Building Executive Information Systems and Other Decision Support Applications. John Wiley & Sons, Toronto, 1997.
[49] Wikstrom, S. and Normann, R. Knowledge and Value: A New Perspective on Corporate Transformation. Routledge, London, 1994.
[50] Wooldridge, M. and Jennings, N. R. Intelligent agents: theory and practice. The Knowledge Engineering Review, 10(2), 1995, pp. 115-152.
[51] Jennings, N. R. On agent-based software engineering. Artificial Intelligence, 117, 2000, pp. 277-296.
[52] Grundspenkis, J. Concept of intelligent enterprise memory for integration of two approaches to knowledge management. In Haav, H.-M., Kalja, A. (eds.). Databases and Information Systems II. Kluwer Academic Publishers, Dordrecht, 2002, pp. 121-134.
[53] Bradshaw, J. M., et al. Terraforming cyberspace. Computer, July 2001, pp. 48-56.
[54] Case, S., Azarmi, M., Thint, M., and Ohtani, T. Enhancing e-communities with agent-based systems. Computer, July 2001, pp. 64-69.
[55] Martin, P. Knowledge representation, sharing, and retrieval on the Web. In Ning Zhong, Jiming Liu, Yiyu Yao (eds.). Web Intelligence. Springer, Berlin, 2003, pp. 243-276.
[56] Berners-Lee, T., Hendler, J., and Lassila, O. The Semantic Web. Scientific American, 284(5), 2001, pp. 34-43.
[57] Ellis, C. and Wainer, J. Groupware and computer supported cooperative work. In Weiss, G. (ed.). Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence. The MIT Press, Massachusetts, 2000, pp. 425-458.
Definitions of knowledge differ depending on the field of investigation. These differences help in understanding the inherent properties and nature of knowledge. The philosopher John Locke defined knowledge as follows: "Knowledge then seems to me to be nothing but the perception of the connexion and agreement, or disagreement and repugnancy, of any of our ideas" [35].
There are two important aspects in the definition given above: first, the systemic nature of knowledge is revealed by the emphasis on connection, and, second, the fact of the ownership of knowledge is acknowledged by referring not to ideas in general but to particular ideas. In the area of knowledge management, one of the most popular definitions of knowledge is given by Davenport and Prusak [36]: "Knowledge is a fluid mix of framed experience, values, contextual information, expert insight and grounded intuition that provides an environment and framework for evaluating and incorporating new experiences and information. It originates and is applied in the minds of knowers. In organizations it often becomes embedded not only in documents or repositories but also in organizational routines, processes, practices, and norms."
Davenport and Prusak's definition shows that an agent needs to possess knowledge in order to be capable of acquiring knowledge, since, according to the definition, knowledge provides the environment and framework for incorporating new experiences and information. This view is in line with the understanding of knowledge that is expressed in the AI literature [37].

Janis Grundspenkis is a professor at Riga Technical University. He currently teaches systems theory and artificial intelligence. His research interests are in the area of applications of intelligent agent technologies in knowledge representation and processing for knowledge management purposes. He leads the research project "Modeling of Intelligent Agent Cooperative Work for Knowledge Management and Process Reengineering in Organizations." His 38-year career has focused on the development of structural modeling methods and tools for heterogeneous system diagnosis. He has published more than 140 scientific publications in this and related fields.

Marite Kirikova has a Dr.sc.ing. in Information and Information Systems. She is the author of more than 30 scientific publications. Marite Kirikova is a scientific researcher and associate professor at Riga Technical University. She has done fieldwork at Stockholm University and the Royal Institute of Technology, and at Copenhagen University. Marite Kirikova currently lectures in systems analysis, knowledge management, and requirements engineering. She also participates in the research project "Modeling of Intelligent Agent Co-operative Work for Knowledge Management and Process Reengineering in Organizations." Contact address: Department of Systems Theory and Design, Faculty of Computer Science and Information Technology, Riga Technical University, 1 Kalku Street, Riga, LV-1658, Latvia. E-mail: [email protected], [email protected]
METHODS OF BUILDING KNOWLEDGE-BASED SYSTEMS APPLIED IN SOFTWARE PROJECT MANAGEMENT
CEZARY ORLOWSKI
INTRODUCTION
Information technology today penetrates all fields of human activity and is becoming a general element in the functioning of contemporary society. It is an instrument of multi-sided communication and information exchange and embraces all areas of life to an ever-increasing degree. The global dimension of information systems introduced into firms has created diverse communication media and is causing radical changes in world economic and organizational structures. Information techniques and tools are among the most significant elements in these developments, deciding the global character of management, the speed and transfer of information, and the speed of decision making. The development of computerization and telecommunications, and the fusion of the two technologies, provides managers with ever more effective information systems (adapted to the needs of the user, precise, fast and able to meet deadlines), which are tools in shaping products of high quality and profitability. The implementation of knowledge-based information projects, which is becoming an important problem for individual companies and for the economy as a whole, involves engaging considerable financial resources and a large implementation risk. In the case of enterprises financed from public money, the high costs are linked to considerable social expectations. These expectations, as well as the high implementation risk, mean that complex research is being undertaken, covering technical analysis of cases of introduction and the possibilities of making use of existing methods, techniques and models to find new solutions in creating knowledge-based systems [14].
In this chapter, existing methods of building knowledge-based systems in Software Project Management (SPM) are discussed. New possibilities of modelling these systems are indicated, and an example of building a model of a social system is presented: a fuzzy model of information project management. The field of research was narrowed down to the implementation of systems by international project consortiums consisting of several or more project teams, understood here as "distinguished from the structure of the organization, commissioned for a defined period and consisting of specialists from various fields, whose knowledge and experience have a bearing on the problem" [68]. The concept of knowledge-based systems (KBS) is used in the literature in a variety of senses: Ullman [71], Bazewicz [2], Bubnicki [7], and Hickman [28]. In the present analysis it is taken to refer to an information system with rule-object representation of knowledge in the form of a hierarchical decision network and a mixed (forward and backward) conclusion-drawing type. The first part presents the state of modelling knowledge-based systems for SPM. The existing methods and project tools are presented and methods of assessing project organization are indicated. The second part discusses new ways of creating SPM models. The possibilities of applying fuzzy sets and fuzzy regulators are examined. In the third part an example of constructing a model of a fuzzy system of SPM is presented, on the basis of the theory of fuzzy regulators and fuzzy systems. First, conceptions of the model are discussed, and then details of the model's construction are described: hierarchical, presenting the hierarchy of levels in managing projects and teams; structural, emphasising the variables (input, output, state, static, dynamic); and fuzzy, formalizing knowledge with the help of fuzzy sets.
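To make the notion of a rule-based system with mixed conclusion drawing more concrete, the following minimal sketch shows the forward-chaining half only; the SPM rules and fact names are illustrative assumptions, not taken from the chapter, and a full KBS of the kind described would add backward chaining and a hierarchical decision network.

    # Forward ("data-driven") half of the mixed inference described above.
    RULES = [
        ({"budget_exceeded", "schedule_exceeded"}, "project_at_risk"),
        ({"project_at_risk", "team_immature"}, "recommend_scope_reduction"),
    ]

    def forward_chain(facts):
        """Fire rules whose conditions hold until no new conclusions appear."""
        derived = set(facts)
        changed = True
        while changed:
            changed = False
            for conditions, conclusion in RULES:
                if conditions <= derived and conclusion not in derived:
                    derived.add(conclusion)
                    changed = True
        return derived

    print(forward_chain({"budget_exceeded", "schedule_exceeded", "team_immature"}))

1. PROBLEMS OF MODELLING SPM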
In attempting to build an SPM model for a knowledge-based system, the aim of recognising the state of the art was set. This knowledge indicates the hierarchy of problems in managing and implementing projects (fig. 1). In management [20] these problems concern access to expert knowledge of SPM, the use by managers of management methods to support project implementation, and the application of models for assessing project processes and teams. The consequences of these problems are exceeding the budget, failure to meet deadlines, and limitation of the aim of the enterprise [5].

1.1. Expert knowledge of project management
According to SPM experts [23], scope, time, resources, communication, risk and project changes are inter-related management problems whose occurrence makes knowledge of SPM and the experience of implementation described in project experience documentation important elements in assessing and directing future projects [73]. Access to such documentation is, however, made difficult by the unwillingness of firms to
publish their own failures, and in many cases by the absence of records [20]. Recreating such knowledge is in turn difficult because of the limited possibilities (one-off events without obvious marks of the frequency of their occurrence) of describing risk and change management. This is also a consequence of the difficulty of documenting project management processes, of the lack of adequate knowledge of the mechanisms occurring during their implementation, and of the problems of formalizing knowledge of SPM [40]. It is also conditioned by the commercial nature of the enterprises [19]. Because of this there is also a lack of knowledge of implementations and of managing them [61]. For project directors, a source of knowledge may be the experience of implementing previous projects [60]. This may constitute a source of knowledge on management, but it depends on the specific character of the projects and on the director's ability to make use of this experience in implementing further enterprises in a new field and with a new project team.
Figure 1. Relationship between management problems (limited application of models in assessing project processes and project teams; difficulties of using management methods to support project implementation; lack of expert knowledge of project management, "the art of management") and problems of project implementation (exceeding budget, exceeding schedule, limitation of project aim).
1.2. Methods of supporting management processes
Apart from expert knowledge, a source of knowledge on enterprise management is the set of methods applied in enterprises, developed for the most part by firms dealing with designing and introducing information systems on the basis of their own experience. These methods constitute guides to formal behaviour in SPM. For example, KADS, presented in the
work of Hickman and Killin [28], and Pragmatic KADS, in the work of Kingston [33], divide enterprises into implementation phases and indicate a selection of solutions from the field of management for these phases. Adaptation of known methods of designing information systems, as presented in the work of Coleman and Somerland [11] as well as Nerson [50], is also possible. Individual methods created by large project teams are also applied, an example being the Project Management Methodology (PMM), worked out by the firm IBM and presented by Lopacinski and Kalinowska-Iszkowska [44]. It places the main emphasis on the processes of planning and implementing the project. In addition, as documentation of project experience of SPM, it makes use of the WSDDM (Worldwide Solution Design and Delivery Methods) package, supporting knowledge acquisition and project management in the following phases:
• project identification (assessment of the feasibility of implementing the project, definition and specification of user needs, definition of critical factors, estimation of risk level), carried out independently by the client and by the provider of the system;
• project initiation (definition of the project management structure, assembly of the team and assignment of tasks to particular people, definition of processes ensuring quality and management of exceptions, as well as definition of criteria for acceptance of results, costs and duration);
• project implementation (cyclical implementation of tasks, regular team meetings, reports on work in progress, internal and external controls, analysis of exceptional situations);
• project completion (preparation of documents of implementation and experience).
With the use of this method, definition of the following processes is also possible:
• plan management: preparation of plans and reports, analysis of the progress of work;
• contract management, including documentation;
• exception management, covering implementation risk;
• reaction to changes: decision-making when problems and errors occur;
• quality assurance: surveys of the correspondence between project implementation and the methodology adopted;
• management of personnel and organization: definition of project structure;
• assignment of tasks to key people and identification of processes, plans of work and employment, team management, taking account of changes and development.
Solutions of the PMM type can be applied by the IBM project team, but considerable risk is involved in adapting them for project teams at less mature levels. According to Boehm [3], algorithmic methods from the groups COCOMO and COCOMO II are used for the economic assessment of enterprises. COCOMO II, which takes account of the maturity of project processes, contains three models: Application
Composition, Early Design and Post Architecture, while COCOMO classifies information projects with regard to risk group (a short code sketch follows the list below):
• organic projects, whose particular characteristic is small teams of a high technical level, with projects in a well-recognised object field and known information tools and methods;
• semi-detached (quasi-autonomous) projects, in which the team members represent various levels of technical knowledge, while the object area and the information tools and design methods applied are generally known;
• embedded projects, in which a complex project of unrecognised object area is implemented, with methods and information resources unknown for the given area, but capable of application.
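As a hedged aside, the flavour of such algorithmic models can be shown with the basic COCOMO effort equation; the (a, b) coefficients below are the widely published basic-model values for the three classes just listed, and the sketch deliberately omits the cost drivers and scale factors that COCOMO II adds.

    # Basic COCOMO: effort in person-months = a * (KLOC ** b).
    COEFFICIENTS = {
        "organic": (2.4, 1.05),
        "semi-detached": (3.0, 1.12),
        "embedded": (3.6, 1.20),
    }

    def cocomo_effort(kloc, project_class):
        a, b = COEFFICIENTS[project_class]
        return a * kloc ** b

    print(round(cocomo_effort(32, "organic"), 1))  # e.g. a 32 KLOC organic project

Other methods support project management with respect to cost, the composition and size of teams, and labour intensity [64]: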
• estimation by analogy: assessment of projects on the basis of earlier implemented and documented projects;
• expert assessment, carried out by a group of independent experts;
• input estimation, based on elementary work units (Work Breakdown Structure, WBS);
• top-down estimation: the method of designing within set cost limits (Design to Cost); introductory decomposition into simpler tasks (work packages) and definition of the necessary outlay of work, then further decomposition into tasks and exact processes, assuming that the cost of the enterprise is the sum of the costs of the individual tasks; in cases where the costs are exceeded, modification of the system is required. Top-down estimation comprises:
• estimation based on a parametric model (the relationship between the outlay of work and the duration of the project, as well as the factors directly bearing on it);
• estimation in order to win (Price to Win): assessment of the enterprise is conducted in such a way as to outdo potential competitors.
To manage time and resources, the method of function points analysis is applied. This method was worked out by Albrecht of the firm IBM in 1979 and later perfected by IFPUG (the International Function Point Users Group) [24]. Its main aim is to calculate the "attributes of productivity of the information system" [1] by receiving:
• input variables;
• output variables;
• internal data collections;
• external data collections;
• questions for the system.
Each of these attributes is subordinated to three degrees of complexity: simple, moderate and complex. Each degree of complexity is assigned a weight. For example:
for the category of input to a system with a moderate degree of complexity, the weight is 6. The total value of uncorrected function points is calculated by applying the equation:
NPF = \sum_{i=1}^{5} \sum_{j=1}^{3} W_{ij} N_{ij}    (1)
where:
NPF - total value of uncorrected function points
W_{ij} - value of the weight co-factor
N_{ij} - number of elements in the project
i - number of the conversion element
j - number of the complexity level.
The next stage is correction of the calculated value, resulting from the conditions of implementation of the real system and embracing 14 factors, including: conversion distribution, productivity of the final user, and simplicity of installation. It is assumed that assessment of the value of these factors is subjective and arises in the course of observing the implementation of the information system. The complex value of the corrected function points is calculated on the basis of the equation:

PF = NPF \times \left( 0.65 + 0.01 \times \sum_{i=1}^{14} K_i \right)    (2)
where:
PF - complex value of the corrected function points
NPF - total value of uncorrected function points
K_i - value of the corrective co-factor.
The value PF is the basis for assessing the labour intensity (expressed in person-months) of implementing the information system. Conversion of the co-factor PF into a labour intensity value takes place with the help of the labour intensity curve, derived from assessments of implemented information projects. The method of function points analysis, like COCOMO, is characterised to a considerable degree by the influence of subjective judgment, rather than objective indicators, in the assessment of project implementation. This means that the assessment of information projects using the solutions presented demands considerable experience and acquaintance with often complicated algorithms of conduct in applying these assessments. It also means that their use to manage changes and risk is markedly limited because of the complexity of risk and change issues in enterprises.
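A minimal sketch of equations (1) and (2) in code may be helpful; the weight table is illustrative (only the value 6 for a moderate-complexity input is taken from the text above; a real count would use the full IFPUG tables), and the fourteen correction factors are assumed to be rated on a 0-5 scale.

    # Equations (1) and (2) of function points analysis, as a runnable sketch.
    # Weights are placeholders, except inputs/"moderate" = 6 quoted in the text.
    WEIGHTS = {
        "inputs": {"simple": 3, "moderate": 6, "complex": 9},
        "outputs": {"simple": 4, "moderate": 5, "complex": 7},
    }

    def unadjusted_fp(counts):
        """Equation (1): NPF = sum over categories i and levels j of W_ij * N_ij."""
        return sum(WEIGHTS[cat][lvl] * n
                   for cat, levels in counts.items()
                   for lvl, n in levels.items())

    def corrected_fp(npf, factors):
        """Equation (2): PF = NPF * (0.65 + 0.01 * sum of the 14 factors K_i)."""
        assert len(factors) == 14
        return npf * (0.65 + 0.01 * sum(factors))

    npf = unadjusted_fp({"inputs": {"moderate": 10}, "outputs": {"simple": 4}})
    print(npf, corrected_fp(npf, [3] * 14))  # 76 and 76 * 1.07 = 81.32

1.3. Description of project teams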
Managing information enterprises demands a considerable involvement of human resources. The most beneficial solution seems to be cooperation with the external
organiser of the enterprise, definition of the attitudes and tasks of particular people, and assignment of the work to be carried out to a team consisting of employees possessing a high level of subject knowledge and considerable organisational skills. In assembling the team, account should be taken (see Heller [27] and Kerzner [32]) of the changes that occur in organisational structures. This means that the production processes should take account of the roles of the employees and their influence on the product under production. Such organizations are seen as a complex system in which information tools and management techniques are directly connected with each other (Workflow Management, Groupware, Process Reengineering, Computer Supported Co-operative Work) and function as a team defining and implementing the aims. In speaking of project teams, we define them as [68] "distinguished from the structure of the organization, commissioned for a defined period and consisting of specialists from various fields, whose knowledge and experience have a bearing on the problem". They are called into being as a result of the low operational efficiency of the organization and the necessity of implementing project tasks. Today, in the era of innovative approaches to organization, the role of project teams has increased owing to the notable effectiveness of their operation. The majority of projects implemented today, both in industry, for instance in manufacturing car bodies, and in the field of show business, depend on the cooperation of groups of people working within the framework of project teams. The idea of project teams derives from the concept of the synergy of knowledge. For this reason the results of team work are not commensurate with the results of the activity of groups that do not co-operate with one another. It is assumed that project teams can be of both formal and informal composition. The first are assembled to implement a particular task, while the second are often structures functioning within enterprises for implementing shared tasks. Project teams are brought together to implement a particular task. It is assumed that the project team is assembled by the project manager, whose task is to present to the team the aims of the team's operation. As a rule the team assembled is interdisciplinary, which makes it essential to apply solutions for consolidating the team (a variety of specialists, various visions of the aim, sources of conflict in implementing the tasks). According to Butler [10], another solution is to call into being an executive team to implement a task, create a system, or put a project into action. Typical executive teams are: problem teams (work teams), summoned to assess a project; project teams (project groups), assembled for a longer period of time to implement a longer-term task; and advisory teams (reference groups). The aim of the advisory team is to manage the many project teams implementing partial aims. On the level of international organizations there are also task forces such as steering groups (steering committees), whose members include representative experts. Their role depends on directing large global and international economic enterprises. Another type of project team is the group with a particular defined aim (task force). A characteristic of these teams is the fact that they consist of a narrow group of specialists concentrated on carrying out a narrow task.
1.4. Models for assessing team and project processes
Figure 2. Five levels of process maturity: initial (1), repeatable (2), defined (3), management (4), optimizing (5).
In this chapter, the presentation of models for assessing teams and project processes is important with regard to recognising the formal possibilities of assessing project teams and the processes implemented. The models CMM and SPICE and norms of the ISO 9000 type are presented. According to Paulk [126], they constitute effective mechanisms for assessing team and project processes and support the work of managers. The CMM (Capability Maturity Model) should supply solutions making possible the control of project processes. Its application supports the assessment of team management processes and defines their level of maturity. It also identifies critical elements of the process that affect the quality of the system being created. Details of the construction of the model are contained in the work of Paulk and Weber [55], while the former also contains information on its use [54]. The team is assessed on the basis of five levels of maturity of the project processes: initial, repeatable, defined, management and optimizing. The structure of the CMM model is presented in fig. 2. Assessment of the level of project teams depends on their method of implementing project processes and on the influence of the environment. For example, a team on the initial level is characterised by the lack of a stable environment. A team on the repeatable level is controlled by means of a management software system [35]. It is distinguished by stable planning processes and by tracking of the project, which means that it is properly managed and constitutes an integrated work environment.
Table 1. Key areas for the levels of process maturity

Maturity level      Key areas of the process
Initial level       Lack of stable environment
Repeatable level    Management of project solutions; assessment of quality of programming; management of contract details; tracing and supervising project processes; planning project processes; management of demands
Management level    Quality management; measurement of processes and analyses
Optimizing level    Management of changes in processes; technological innovations; protection against errors
A team on the defined level is characterised by the use of standard processes in creating information systems. It applies information systems to the management of the project team, as well as supportive design processes that create an integrated project environment [34]. It is defined as an SEPG (Software Engineering Process Group): a team making use of information tools to define and support its activity by constantly training the work force, raising their skills in imparting knowledge. Members of the team define their own processes for the specific types of projects under implementation. In implementing project processes, peer review is applied in order to raise the quality of the information system [38]. A team on the management level defines quantitative criteria for assessing the quality of the system being created. Productivity and quality become measurable values. Systems making use of databases collect up-to-date information on the processes being implemented. Processes and products are defined quantitatively.
The functioning of teams on the optimizing level depends on the concentration of work around processes that ensure the possibility of their constant improvement. Weak elements are sought out and strengthened as they appear. Innovative solutions are introduced, mainly based on new technologies. Key areas of the processes for the levels of maturity are defined (table 1). Thus the CMM constitutes a solution whose nature is both qualitative and quantitative, and which can be applied to assessing project teams. The assessment indicates the level of maturity and, at the same time, the level of risk in implementing the enterprise. It is therefore an important indicator in team selection (level of maturity of the team), the method of directing the team (key areas of processes) and the exploitation of information technologies. The question arises, however, whether this model can be a pattern of conduct in defining the maturity levels of teams and processes. In the course of creating a project consortium with the task of implementing the project COMMODORE (financed from European Union resources), an assessment of the level of maturity of the organization was carried out by a potential coordinator on the basis of questionnaires issued to two project teams. This showed that:
• identifiers are defined too precisely, which causes problems in referring them to the functioning of real teams;
• in order to conduct an assessment, considerable knowledge and experience is needed in assessing teams on the basis of identifiers;
• the assessment refers to the initial state of the team (before implementation of the enterprise), whereas its level of maturity may change in the course of implementation;
• assessment of teams is carried out on the basis of states, and not on the character of processes, e.g. whether it uses Gantt's diagrams, not how it uses them; whether it tests user quality, not how it tests user quality;
• problems appear with the quantitative assessment of the solution obtained on two levels of maturity.
The SPICE [29] model for assessing processes is an example of a solution of both quantitative and qualitative character for estimating the processes of creating information systems. It is applied in estimating project processes. The procedure is presented in fig. 3.

Figure 3. Method of assessing project processes.

SPICE can be used by organizations dealing with monitoring, developing and improving project processes, covering [4]:
• the possibility of self-assessment of implemented project processes;
• assessment of the project implementation environment;
• creation of a set of methods for assessing processes (a profile of processes);
• creation of conditions for directing processes.
The model may be used in the work of project teams of varied sizes and implementation capacities. It is accepted that the estimation of processes is based on their repeatability.
Table 2. Description of process categories

Process category                    Description of category
Customer-supplier CUS               Processes on which the user has a direct influence
Project ENG (Engineering)           Processes that specify, implement and maintain the system
Project management PRO (Project)    Processes of creating the project infrastructure, co-ordination and management of resources
Support SUP                         Processes supporting other processes in the project
Organization                        Processes of assessment and support for the business character of teams as well as product improvement
Figure 4. Elements of SPICE: initial assessment of processes (aim of the project, scope, limitations, possibilities); process assessment (tools for process assessment: indicators, comparative scales; process models: process selection, verification); final assessment (quantitative adequacy, process capacities).
It is assumed that each implemented project process is characterised by certainty, weakness and executive risk. It may be assessed according to the stated aim, the time of implementation and the costs incurred, as well as the possibilities of its implementation and the project risk. The stages of assessment of the project processes (fig. 4) comprise the initial and final assessment, in the course of which the model of project processes and information tools are used. The initial assessment of the processes takes into account the aim of implementing the project, its scope, limitations and implementation capacities, as well as definitions. The final assessment covers the level of adequacy in relation to the model processes and the possibilities of their implementation. Identifiers of processes are used for assessment, as are comparisons of real processes in relation to model ones. The SPICE model constitutes a complement to several other international standards for the assessment of processes, presented for instance in the work of Crosby [12] and Dion [16], and to other models for assessing the capacities and effectiveness of teams and processes. In table 2 the categories of processes are presented, while within the framework of particular categories the type of process is defined and codified as:
• PC (Process Category): identifier of the process category;
• PR (Process Number): number of the process;
• CL (Capability Level): level of capability;
• CF (Common Feature Number): common identifier;
• PT (Practice Number): number shared with another process.
As examples, within the category of project processes (ENG), particular processes are defined:
• changes of requirements in relation to project processes;
• changes of requirements in relation to project tools;
• changes of requirements in relation to the system;
• application of project tools;
• integration and software tests;
• integration and system tests;
• system maintenance and programming.
According to the Base Practice Adequacy Rating Scale, these processes are estimated using the scale:
• N (inappropriate): its implementation does nothing to meet the aims of the enterprise;
• P (partially appropriate): its implementation does something to meet the aims of the enterprise;
• L (largely appropriate): its implementation does a great deal to meet the aims of the enterprise;
• F (totally appropriate): its implementation entirely meets the aims of the enterprise.
Analysis of the SPICE model shows that the division presented there into categories and processes is too detailed, which means that managers have serious problems in classifying the implemented project processes. Identification of processes, e.g. designing the rules of the knowledge base, enables them to be placed in one of the categories, e.g. ENG. It is not clear, however, whether they are ENG3 processes (changes of requirements in relation to the system) or ENG5 (integration and programming tests). Application of the Base Practice Adequacy Rating Scale also becomes complicated (e.g. partially appropriate or largely appropriate?), while the presence of subjective judgement considerably influences the result, which means that the main criterion in conducting an assessment becomes acquaintance with the method and experience in applying it. It therefore becomes essential to seek solutions that make possible a quantitative assessment of teams and processes. They should:
• minimalize the aspect of subjective assessment of processes and teams;
• limit the complexity of the method of assessment and adapt it to the level of the manager;
• create easy possibilities of implementation, taking account of the operating system and the project tools;
• be based on expert knowledge of SPM;
• fulfil the role of a storehouse of knowledge that is constantly updated by new experience;
• enable scenarios of action to be analysed.
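As a minimal illustration, a quantitative encoding of the Base Practice Adequacy Rating Scale can be sketched as follows. The numeric values and the averaging rule are assumptions chosen for illustration; they are not part of the SPICE standard.

```python
# Hypothetical numeric encoding of the N/P/L/F adequacy ratings.
ADEQUACY_SCORES = {
    "N": 0.0,   # inappropriate
    "P": 0.33,  # partially appropriate
    "L": 0.66,  # largely appropriate
    "F": 1.0,   # totally appropriate
}

def process_capability(ratings):
    """Average the numeric adequacy scores of a process's base practices."""
    return sum(ADEQUACY_SCORES[r] for r in ratings) / len(ratings)

# Example: an ENG process whose base practices were rated L, F and P.
print(round(process_capability(["L", "F", "P"]), 2))  # 0.66
```

Such an encoding reduces, but does not remove, the subjectivity of the underlying ratings; it merely makes their aggregation repeatable.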
It is worth emphasising that in practice, when implementation problems arise, team directors are not interested in applying methods for assessing processes, but in completing the project in the set time and with the agreed means.

2. NEW POSSIBILITIES FOR CREATING THE SPM MODEL
The view often appears in the subject literature [58] that SPM is more an art than a planned method of action. This view results from the fact that in the course of project work the decisions taken by the team leader are the results of processes that are difficult to plan, such as changes in the make-up of the team (the best programmer may be "bought up" by a rival firm). For this reason many managers consider experience to be the main source of knowledge on enterprise management. They connect its success (completion within the set time, with the agreed means and the previously established quality) with the ability to forecast changes and to react to increased risk in completing the project. Others in turn, while not questioning experience, assert that it is not possible to plan, organise and control an enterprise without applying formal methods of procedure to management (system approaches).

Today the top-down approaches described by Budgen [8], Sommerville [67] and Yourdon [75] are used. According to Gane [22], in top-down approaches to modelling information enterprises the task division is a function both of the particular features of the given enterprise and of the standards of work accepted by the programming firm. This is connected with models of software construction that define the methods of implementing project tasks [30], e.g., the cascade (waterfall) model, prototyping, incremental implementation, the spiral model and formal transformations. Besides the model of the software life cycle, another solution that makes use of the top-down approach is diagnostic analysis, presented among others in the work of Kusiak [42]. Two phases of its application can be distinguished: analysis of the existing state and definition of the anticipated state. It is used in designing organisational, technical, economic and social systems. A conceptual approach is represented by prognostic analysis [52]. Its procedures embrace: justification of the aim of the research, and a synthetic stage that contains the working out of the concept of the system and an analysis of its individual elements on this basis. Methods based on the selection and reduction of variants of the solution are used, ensuring the adaptation of the model to the conditions and limitations of the enterprise. System analysis from the cybernetic aspect [37] makes use of feedback techniques. It is used in the analysis of technical systems [53] (of the SCADA type, diagnostic, advisory) and organisational ones (of the SWD type - decision support systems).
The initial stage requires a detailed description of the system, after which an attempt is made to describe it mathematically. Mathematical system analysis [42] is used mainly in building models of the black-box type, in which the relationship between input and output parameters is analysed by means of the operator Ax [69]; next, a detailed identification of the input parameters is conducted. This may take the form of Vetter's linear integral operator. The methods of assessing projects in fuzzy categories (Fuzzy Projects), discussed in the work of Slowinski [66], Weglarz [72], and Hapke and Jaszkiewicz [26], are also used. These are methods of scheduling projects by metaheuristic algorithms, that is, genetic/evolutionary algorithms, simulated annealing and tabu search. Such methods are used in continuous and non-linear problems, as well as in problems of combinatorial optimization. The multi-objective variant of this algorithm (Pareto Simulated Annealing - PSA) makes it possible to search for a representation of competitive (non-dominated) solutions. It is also possible to use the interactive search method (Light Beam Search) [48] to support the work of the manager in choosing one of many solutions in the area found by PSA. The formal, multi-criteria, static, dynamic and informal methods presented in the work of Bubnicki [7] and Ghezzi [23] are also used.

A different group of system models are the models of management in uncertain conditions presented by Pawlak [56] and Kacprzyk [31]. Their characteristic feature is incomplete or absent information. Examples of types of such models are relational, probability, game and fuzzy models. Relational models are characterised by the definition of dependencies between conditions and results. In statistical models of the probabilistic type, the probability distributions used in decision making or in the selection of factors are analysed. Game models make use of game theory to assess members of the team, decision makers or co-participants in decision making. Peled presents a model of certainty of solution [57]. Fuzzy models enable decisions or management processes to be analysed in situations in which an algorithmic description is impossible but expert assessments can be applied [49].

The work of Krawczyk and Mazurkiewicz [40] presents the creation of applications by the use of a method supported by heuristic techniques and information tools (Borland C++ Builder). This method is based on a conceptual model applied to a skeletal application which is then implemented. This fits into the concept of design patterns presented in the work of Buschmann and others [9]. An object approach is used in analysing and specifying as well as in implementing the system, which covers elements with "independent concurrent units". For the remainder, the activity and methods of their use are defined:

• the method of making a component of a system - program, library, types of function;
• inter-accessibility - the method of communication between component and user;
• functions used by other components.

Projects and project groups are used. Projects enable elements to be built on a modular basis, while groups support the compilation process, making it possible also to use files which can be a base for creating components in other environments.
2.1. Use of modelling and simulation theories
Conducting modelling processes demands a definition of the basic concepts appearing in modelling and of the relationship between the model and the real system. According to Cross [13], the aim of modelling is to examine "the relationship between real systems and models". The modelling relationship concerns the relevance of the model, that is, its correspondence with the real system. The degree of this correspondence is measured on three levels. One: the model has replicative relevance if the data it generates correspond to the data obtained earlier from the real system. Two: the model has predictive relevance if the correspondence of the two groups of data can be assured before obtaining data from the real system. Three: the model has structural relevance if it not only duplicates the observed reaction of the real system, but also faithfully reflects the way the real system works.

The basic framework of the modelling process covers:

• informal description of the model (with the use of the following notation: natural language, diagrams, supporting techniques - semantic networks, frames), with the aim of defining the interaction of elements and the descriptive variables;
• formal description (structural or object) according to the following categories: time - continuous and discrete models; values taken by random variables - discrete, continuous and mixed; deterministic and stochastic models; category of the model's effect on the environment (no effect - autonomous; effect - non-autonomous) [21];
• implementation of the model with the use of information tools.

In system modelling, hierarchical approaches are used with the following stages (fig. 5):

• description of the real system;
• description of the structure of the experiment;
• definition of the basic model;
• acceptance of the integrated model;
• implementation of the model with the use of information tools;
• testing the results obtained;
• assessment of the model.

The real system is defined as the source of data; it may also be defined as a natural, artificial or mixed system, analysed in categories of observable and non-observable descriptive variables. Observable variables include input and output variables (the latter being the result of the operation of the input variables). The structure of the experiment involves a collection of data (subsets of input-output relations) in which the real system - management at the project team level - can be described. The basic model represents all existing inputs and outputs of the real system (within the framework of the experiment's structure) and should supply essential information about the reactions of the real system.
Figure 5. Hierarchical approaches in modelling.
The relevant integrated model arises from a simplification of the basic model.

2.2. Application of fuzzy set theory

Fuzzy models of the linguistic type (Linguistic Models, LM) are models with a set of rules of the IF-THEN type, with fuzzy conditions and fuzzy conclusion drawing. They are discussed by Yager [74]. In turn, models containing logical rules with a fuzzy condition and a functional conclusion are presented in the work of Tong [70]; they carry the name Takagi-Sugeno-Kang (TSK). The most commonly used model is Mamdani's model [47], describing a real system with the help of linguistic rules. The example below presents the process of fuzzy modelling for a case involving two input variables and one output variable. The rules are:

IF (u1 is Ai) AND (u2 is Bj) THEN (y is Ck)
(3)

where:
Ai, Bj, Ck - fuzzy sets
u1, u2, y - input and output variables
i, j, k - quantity of fuzzy sets
The Takagi-Sugeno-Kang (TSK) models were also presented in the work of Klir [36]. They are also known as quasi-linear models or fuzzy linear models. The TSK models differ from Mamdani's model in the form of the rules:

IF (u1 is Ai) AND (u2 is Bj) THEN (y = f(u1, u2))    (4)

where:
f(u1, u2) denotes the function of the output variable, of linear or non-linear form.
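A short sketch of TSK inference may make the difference from the Mamdani model concrete: each rule contributes a functional consequent f(u1, u2), and the crisp output is the firing-strength-weighted average of these consequents. The membership functions, the three rules and their linear coefficients below are invented for illustration only.

```python
# Minimal TSK inference for two inputs; the rule base is an assumption.

def tri(x, a, b, c):
    """Triangular membership function with apex b on support [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

# Each rule: (membership of u1, membership of u2, linear consequent f(u1, u2)).
rules = [
    (lambda u1: tri(u1, 0, 0, 5),   lambda u2: tri(u2, 0, 0, 5),   lambda u1, u2: 0.2 * u1 + 0.1 * u2),
    (lambda u1: tri(u1, 0, 5, 10),  lambda u2: tri(u2, 0, 5, 10),  lambda u1, u2: 0.5 * u1 + 0.4 * u2 + 1.0),
    (lambda u1: tri(u1, 5, 10, 10), lambda u2: tri(u2, 5, 10, 10), lambda u1, u2: 0.9 * u1 + 0.7 * u2 + 2.0),
]

def tsk_output(u1, u2):
    # Firing strength of each rule: AND realised here as the product.
    weights = [a(u1) * b(u2) for a, b, _ in rules]
    total = sum(weights)
    if total == 0:
        return 0.0
    # Weighted average of the functional consequents gives the crisp output.
    return sum(w * f(u1, u2) for w, (_, _, f) in zip(weights, rules)) / total

print(tsk_output(4.0, 6.0))  # 5.4 for these assumed rules
```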
Relational models were worked out by Pedrycz and presented in the work of Piegat [59]. In these it is accepted that fuzzy rules are treated as partially true; an appropriate trust co-factor is assigned to them. The theory of relational equations is used in identifying the rule bases. The global and local fuzzy models presented in the work of Dean [15] refer to conditions in which the global space is divided into local spaces in order to obtain a high degree of accuracy. Here both Mamdani's model and TSK are created. The basis for building fuzzy models is the fuzzy set, which is used to assess physical size, states of the system and properties of objects [76]. We describe as a fuzzy set (Ai) a set of pairs:

Ai = {(μAi(u1), u1)}    (5)

where:
μAi(u1) is a function giving the value of u1's membership of the fuzzy set Ai.
Linguistic variables represent a type of input, output or state variable, e.g., the state of management of an information enterprise. A linguistic value is a verbal assessment of the linguistic variable (examples for the variable described above: adequate, inadequate). Examples of fuzzy numbers are: around zero, more or less 5, a little more than 9, somewhere between 10 and 12. In turn, the linguistic space of linguistic variables (Linguistic Term-Sets) is the set of all the linguistic values applied in assessing linguistic variables. The membership function μAi(u1) maps the value of the variable u1 onto the interval [0, 1]:

μAi: u1 → [0, 1]    (6)
Examples of the membership function for set Ai are presented in fig. 6. The number of pairs (μAi(u1), u1) appearing in the set is called the power of the fuzzy set:

||Ai|| = n    (7)
Figure 6. Examples of membership functions.
Figure 7. Processes of fuzzy modelling for a case involving two inputs and one output (fuzzification: membership function building for the input values u1, u2; inference: rule-based membership function building for the output value; defuzzification: calculating the crisp value using the membership function).
The processes of fuzzy modelling for a case involving two inputs and one output are presented in fig. 7. They include: fuzzification, inference and defuzzification. In the fuzzification processes, for the crisp values (u1, u2) constituting the input to the model, their degrees of membership of the fuzzy sets (Ai, Bj) are calculated. A condition of implementing the fuzzification processes is the definition of the membership functions (μAi(u1), μBj(u2)) of the fuzzy sets. In the conclusion-drawing (inference) processes, the membership function for the output value (μCk(y)) is calculated on the basis of the input degrees of membership (μAi(u1), μBj(u2)). Construction of the membership function for the output variable (y) takes place in the following stages:

• construction of the rule base;
• activation of the conclusion mechanism;
• definition of the degree of membership for the output value of the model;
• calculation of the crisp value for the output value.
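The whole pipeline can be sketched in a few lines of code. The sketch below assumes triangular membership functions, min as the AND and implication operator, max aggregation and centroid defuzzification; the sets and rules are invented for illustration and are not the chapter's tuned model.

```python
# Mamdani-type fuzzy modelling for two inputs and one output:
# fuzzification, rule inference (min), aggregation (max), centroid defuzzification.

def tri(x, a, b, c):
    """Triangular membership function with apex at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

# Fuzzy sets for inputs u1, u2 and output y (three linguistic values each).
A = {"low": (0, 0, 5), "mid": (0, 5, 10), "high": (5, 10, 10)}
B = A  # the second input is assumed to share the same sets
C = {"low": (0, 0, 50), "mid": (0, 50, 100), "high": (50, 100, 100)}

# Rule base: IF (u1 is Ai) AND (u2 is Bj) THEN (y is Ck).
rules = [("low", "low", "low"), ("mid", "mid", "mid"), ("high", "high", "high")]

def mamdani(u1, u2, steps=200):
    ys = [i * 100.0 / steps for i in range(steps + 1)]
    agg = [0.0] * len(ys)
    for ai, bj, ck in rules:
        # Fuzzification + AND via min gives the rule's firing strength.
        strength = min(tri(u1, *A[ai]), tri(u2, *B[bj]))
        for n, y in enumerate(ys):
            # Clip the output set at the firing strength; aggregate by max.
            agg[n] = max(agg[n], min(strength, tri(y, *C[ck])))
    num = sum(m * y for m, y in zip(agg, ys))
    den = sum(agg)
    return num / den if den else 0.0  # centroid defuzzification

print(mamdani(3.0, 7.0))  # 50.0 for these assumed sets and rules
```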
Analysis of the structure of the fuzzy model and of the mechanisms of fuzzification, inference and defuzzification reveals the considerable influence of the kind of operators used on the accuracy of the fuzzy model. It has been demonstrated that in the case of self-tuning models the selection of operators has less significance because of the model's learning processes. If we use untunable models, the influence of the operators is considerably greater and cannot be compensated for in any way. According to Driankow [17], it is then necessary to use the trial-and-error method in the processes of selecting operators. According to Gupta [25], the indicator for applying various operators is the frequency of their use. Knowledge of the use of operators is important insofar as, in the course of constructing the model, it allows preliminary estimates to be made of their effect on the accuracy of the model.

2.3. Application of elements of fuzzy regulator theory
On the basis of Drucker's work [18] it can be agreed that management processes include: planning, organization, motivation, monitoring and decision making. Modelling such processes requires knowledge of steering (control) theory and of feedback mechanisms, as well as a definition of the object and regulator of steering. The general form of the steering law for the arrangement presented in fig. 8 is as follows:

u(t) = f(e(t))    (8)

where:
f - steering function (in the case of fuzzy regulators, described with the help of fuzzy rules)
t - time
S - the object of steering
w - set value
y - reaction of the object at output
C - steering regulator
u - steering signal
e - error (e = w - y)
By the concept of regulators of the FLC type (Fuzzy Logic Control) we understand a steering law in the form of rules of the IF-THEN type, with fuzzy conditions and a steering mechanism based on fuzzy logic. The following are used in modelling them:

• expert knowledge of system operation, definition of its function and the construction of an informal model in the phases of analysis and synthesis, modelling and production;
Figure 8. Block outline of the steering arrangement with feedback mechanism.
• the experience of knowledge engineers and experts in creating and implementing a formal model - simulation and quality management methods are used;
• decision techniques, agent systems [62], design patterns [9] and shell-type environments [33];
• measurement data of system input/output (self-organising models);
• measurement data of system input/output (self-organising and self-tuning models).

A definition of the self-tuning model was given in the introduction, while the self-organising model is understood as one characterised by the ability to define the optimal number of rules, their form and the fuzzy sets [17]. Rules constructed with the use of expert knowledge are an example of solutions that integrate open and hidden knowledge from the content area. A drawback is the considerable influence of the expert's subjective judgement on the form of the rules representing open knowledge. Sometimes two different models of the same system arise. The selection of the parameters of the membership function minimises the model's error with regard to the system. The selection of the assessment criterion (the measure of model error) depends on the modeller, who accepts the average real error and the maximum error. For the remaining cases the number of rules and fuzzy sets, as well as the input and output data, are automatically selected in the course of modelling.

The construction of a self-tuning and self-organising model, treated as a dynamic fuzzy regulator, involves the following stages:

• analysis of the method of modelling appropriate to building the fuzzy regulator;
• analysis of fuzzy steering;
• design of the dynamic fuzzy regulator;
• construction of a model of the fuzzy regulator;
• denotation of linguistic variables;
• construction of the knowledge base;
• tuning.

The steering function f in the FLC regulator for the dynamic input/output model is described by a rule base of the form:

IF (ut is B10) AND (ut-1 is B11) AND ... AND (yt-n is A1n) THEN (yt is A10)    (9)
IF (ut is B20) AND (ut-1 is B21) AND ... AND (yt-n is A2n) THEN (yt is A20)    (10)
...
IF (ut is Bm0) AND (ut-1 is Bm1) AND ... AND (yt-n is Amn) THEN (yt is Am0)    (11)
where:
B10, B11, ..., B1n - input fuzzy sets
A10, A11, ..., A1n - output fuzzy sets
ut, ut-1 - input values
yt - output values
t - time

In turn, the tuning processes will involve:

• denoting the FLC parameters and scaling co-factors;
• denoting the knowledge base for the regulator;
• constructing the membership function;
• minimising the model error, e.g. the absolute average error (the algebraic difference between the comparative value obtained on the basis of the model and the result of measurement of the measured quantity, in relation to the number of measurements).

In some works [59] account has been taken of the application of the following tuning processes: fuzzy neural networks, searches, clusterisation, non-fuzzy neural networks and heuristic methods. Fuzzy neural network methods are based on transforming the fuzzy model into a fuzzy neural network [73] and using measurement data in the processes of training the network with the aim of tuning it. Search methods depend on using organised and unorganised forms to tune the model [63]. Clusterisation methods depend on grouping the results of measurements in clusters and assigning their centres of gravity to the apexes of the membership functions. Unorganised forms are trial-and-error methods; an example of an organised method is genetic algorithms [63]. Non-fuzzy neural networks, like heuristic methods, are rarely used in model tuning processes.

Other steering laws can be described with the help of ordinary and partial differential equations. These are continuous or discrete dynamic arrangements with lumped or distributed parameters, stationary or non-stationary [39]. The presented survey of algorithms for steering fuzzy regulators shows the possibility of implementing them both in social and in technical systems, after earlier definition of the steered object and the steering regulator. The selection of a dynamic steering regulator and the methods of describing it depend, however, on the type of input and output trajectories of the system. In the author's earlier work [51] the possibilities of employing fuzzy regulators in building management models were presented.

3. EXAMPLE OF BUILDING A FUZZY SPM MODEL
In this chapter it is accepted that fuzzy models are constructed using knowledge of SPM and of enterprise modelling. In the first case this knowledge is obtained from managers, in the second from specialists in the field of system modelling. First, knowledge concerning SPM is presented; the applied formal methods of describing it and the possibilities of using it in building a fuzzy model are discussed. On the basis
of the collected knowledge of SPM and of the solutions from the area of modelling that are capable of application, a concept of the structure of the model is presented. It has been assumed that the strategic task for the modeller was the selection of methods for formalising the knowledge possessed by the manager, with regard to the necessity of describing the complex socio-technical system that is SPM. Problems connected with the selection of formal methods appeared quickly, in the course of the initial documentation monitored by the instigator of the enterprise, when it turned out that accuracy of description of often unique management processes, if their number is considerable, loses its meaning and does not lead to a full description of the system. The well-known statement of Zadeh [76], co-creator of the theory of fuzzy sets, hence suggests itself: "if the complexity of the system increases, then our ability to formulate accurate and also meaningful views of its behaviour decreases, until we reach a threshold value, beyond which precision and meaning become almost mutually exclusive features". Making use of a complex mathematical apparatus gives the possibility of a precise description of repeatable processes, mainly for technical systems (mechanical, electrical), whereas in the case of social systems a departure from such precise description is suggested [46], with the use of approaches based on fuzzy logic [48]. Therefore, also in the case of modelling socio-technical systems such as SPM, the apparatus of fuzzy set theory may be used both in the processes of formalising knowledge and in the adaptation of the model [68]. The construction of the fuzzy model of SPM (fig. 9) can then be treated as a process of continuous modelling with the help of fuzzy algorithms.

The idea behind this concept is to distinguish two areas: managing the enterprise and modelling it. The manager, within the framework of the structure of the experiment (understood here as the set of management processes at the level of the project team in the course of implementing the information enterprise), provides the modeller with information. This knowledge concerns: the structure of the model created (the number of input and output variables and the fuzzy sets characterising each of the variables, and the form and number of the rules) and its parameters (the apexes of the membership functions) (fig. 9). The enterprise modeller transforms the data obtained from the manager on the linguistic level and applies the apparatus of fuzzy set theory. Modelling on the linguistic level is applied because of the effectiveness of transforming data obtained from experts [45]. The data concerning the parameters of the model are supplied by the modeller in the process of tuning. Next, the processes of adapting the parameters and structure of the model are carried out, steered by the manager, who, while directly influencing the modeller (broken line in fig. 9), also affects the structure and parameters of the model, according to the criteria of experimental and model correspondence [48].

3.1. The concept of model construction
Keeping in mind the accepted assumptions on the necessity of building a fuzzy model by exploiting the knowledge of the manager of the enterprise, the possibilities of making use, in constructing the model, of the knowledge of the enterprise manager (expert knowledge, methods and models) and of the modeller of the enterprise (basics of modelling and simulation, theory of fuzzy sets and regulators) have been presented.
Figure 9. Continuous modelling as an open concept of system approach in building a fuzzy model of SPM with the help of fuzzy algorithms (the area of management - software project management and the structure of the experiment - supplies data on the structure and parameters of the model to the area of modelling - modelling theory, structural modelling on the linguistic level, tuning of the model parameters, and adaptation of the parameters and structure).
The enterprise manager's knowledge concentrates on recognising problems, describing experiences, and the methods and models applied. We have indicated the problems of SPM when the manager is inexperienced, and the methods and models applied to managing teams and processes, which concentrate mainly on the assessment of teams and processes, on economic assessment, and on the assessment of time, project resources and knowledge. Appropriate examples have been adduced in support [29]. In the case of the modeller of SPM, knowledge is collected concerning the foundations of modelling and simulation processes and of fuzzy set and regulator theory, essential in defining the concepts and processes appearing in the course of modelling. The manager's and modeller's knowledge enables the processes of modelling SPM to be conducted with the use of an open concept of system approach. This concept is presented in fig. 9, taking account of:

• preparation of data concerning the structure and parameters of the model;
• structural modelling on the linguistic level;
• tuning the parameters;
• adapting the parameters and structure.
Data concerning the structure of the model
On the basis of the manager's knowledge it is accepted that SPM will be implemented at the level of the project team (SPMz) in four areas: management of knowledge, project processes, infrastructure and supporting technologies. It is also accepted that SPMz will be analysed according to the phases of the enterprise, while the scope of activity of the manager will concern planning the selection, organization and monitoring of the application of information and management methods and tools (MNIiZ). With regard to the dynamic changes in MNIiZ, the use of the concept of variable states in describing them is planned, and an expert assessment of them according to the practices applied in SPMz is assumed. Three areas of exploitation of MNIiZ are defined (for each field), as shown on a linear scale in fig. 10. These scales will be constructed for the previously given four areas of management, both for methods and for tools of information technology and management. Two layers have been marked on each scale.
Figure 10. Method of creating a linear expert scale to define management states (layer I: scalar values for MNIiZ, 0-100%; layer II: range of MNIiZ; separate scales for methods and for tools).
Layer I is described by three identified values: the scalar states of information and management methods and the scalar states of information and management tools. They are calculated as a weighted value, both for methods and for tools, on a scale from 5% to 100% for methods and from 0 to 100% for tools. The choice of a scale for methods starting from a 5% value results from the fact that it is impossible not to manage an enterprise at all (0%). It is accepted that the values of the co-factors appearing in calculating the scalar states of methods and tools of management depend:

• in the case of managing infrastructure and knowledge - on the number of team members who apply MNIiZ in relation to the number of team members at a given stage (ks - composition co-factor);
• in the case of managing processes and supporting technologies - on the number of implemented processes in which MNIiZ is applied in relation to the overall number of processes implemented at a given stage (kp - process co-factor).

On the basis of the sum of the scalar states of information and management methods and the scalar states of information and management tools, a generalised scalar state of management of knowledge, infrastructure, project processes and supporting technologies is defined. This is a sum that takes into account the influence of methods and tools, represented as weights (α1, α2, β1, β2, χ1, χ2, δ1, δ2). It is accepted that in calculating the generalised scalar states of management, weights are incorporated whose values are established on the basis of expert assessment. Layer II covers the fields of the MNIiZ used (which are obtained from experts on the basis of best practice in managing information enterprises) that have an influence on its planning, implementation, monitoring and decision making.
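The computation of a generalised scalar state can be sketched as follows. The weight split, the helper names and the numeric inputs are illustrative assumptions; in the chapter the weights are fixed by expert assessment.

```python
# Generalised scalar state of one management area as the weighted sum of the
# scalar states of methods and of tools, each corrected by a co-factor
# (ks - composition co-factor or kp - process co-factor, depending on the area).

def scalar_state(base_percent, cofactor):
    """Scalar state of methods or tools, scaled by a co-factor in [0, 1]."""
    return base_percent * cofactor

def generalised_state(methods_percent, tools_percent, cofactor,
                      w_methods=0.6, w_tools=0.4):
    # The 0.6/0.4 weights stand in for the expert-assessed alpha/beta values.
    return (w_methods * scalar_state(methods_percent, cofactor)
            + w_tools * scalar_state(tools_percent, cofactor))

# Example: in the knowledge management area, methods at 50% (diagrams) and
# tools at 40%, each applied by 6 of the 8 team members (ks = 0.75).
print(generalised_state(50.0, 40.0, cofactor=6 / 8))  # 34.5
```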
Example: Method of using the scale to define the generalised scalar state of knowledge management
If the manager of an enterprise obtains knowledge by means of direct talks with experts and formalises it with the help of diagrams or a rule description, the application of this method has considerable influence on planning the obtainment of knowledge, on the control of its obtainment and on the taking of decisions concerning it. For this reason, the idea of the concept given previously is to separate the area of knowledge management, in which, for the given scopes of the information and management methods used, a scalar state of information and management methods is defined, as well as a scalar state of information and management tools. For the method of direct talks with experts the scalar state of information and management methods is defined at 5%, for formalisation of knowledge with the help of diagrams at 50%, and for a rule description at 100%. The scalar state of information and management tools is similarly defined. The correct values are obtained with the use of the co-factor resulting from the number of team members employing the given methods and tools in relation to all the members of the team. Next, the generalised scalar state of knowledge management, being the sum of both values with the weights taken into account, is calculated.

The use in the work of project teams of new or modified MNIiZ involves employing financial resources, defined further as resources. The manager, in employing new or modified versions of the existing MNIiZ, pays attention to the current generalised scalar states of management and plans the resources and the implementation time for a stage of the enterprise in relation to the schedule and budget. Next, after introducing the MNIiZ, he assesses the resources set aside and analyses the task implementation time with the use of the MNIiZ. He also takes into account both the generalised scalar states of management and the planned and real resources, as well as the time of implementing the stage of the enterprise. Analysis of the relationship between the exploited MNIiZ and the generalised scalar states of management, as well as the resources and time set aside for implementing the stages of the enterprise, raises a question about the reversibility of
the function whose arguments are the generalised scalar state of management and the values of real resources and implementation time. This function is reversible for the arguments: the generalised scalar states of management and the values of resources and implementation time. On the basis of the values of resources and time one cannot, however, define the scalar states of information and management methods and the scalar states of information and management tools. In constructing the concept of the model, solutions from the field of management such as "hand steering" of the composition of the team were ignored. Equally irrelevant are the previously identified phases of the enterprise (definition and specification of requirements, construction of the model and its implementation), because the manager initially chooses the MNIiZ for all phases of the project and conducts team training (if necessary). It is also accepted that the manager is a specialist in the field of information science and can select, exploit and assess the information solutions applied in enterprise management; he is not simply responsible for selecting the team and supervising their work.

Data concerning the parameters of the model
A concept of a self-tuning fuzzy model with a permanent rule structure has been accepted, making use of expert knowledge of SPMz. The team manager collects knowledge in the knowledge base for the experiments, recording it in the form of rules whose structure corresponds to the rules of the fuzzy model. It is accepted that the experimental knowledge base is classified with regard to the type of enterprise (e.g., successful, unsuccessful). An expert in enterprise management, the co-ordinator of the enterprise or the team leader conducts this classification. Successful enterprises are defined as those completed in the given time, with the agreed resources and with the aim implemented. By using this kind of solution we avoid "averaging", that is, creating a useless model. The rules recorded in the experimental knowledge base will be grouped according to the phases of the enterprise. Such ordering influences the method of identifying clusters and creates the conditions for calculating the co-ordinates of the centres of gravity of the clusters, identified later as the co-ordinates of the apexes of the membership functions. Adding new rules to the experimental knowledge base will effect a change in the position of the centres of gravity of the clusters and, in consequence, of the apexes of the membership functions. In the concept of the model the possibility of its self-learning is assumed, accepting the idea of selecting changes in the position of the apexes of the membership function with the addition of new rules, classified earlier with regard to their degree of certainty (defined as the ratio of the degree of membership of the variables to the rule analysed). The concept of introducing new rules to the experimental knowledge base is presented in fig. 11. In the conception of the model's construction it is also assumed that in the processes of tuning the model, knowledge will be exploited that covers the management of project teams implementing information enterprises through international consortiums. The sources of knowledge will be: documentation of the enterprise's implementation and the knowledge of the co-ordinators and leaders of the project teams making up these consortiums.
Figure 11. Procedures of introducing new rules to the experimental knowledge base (adding new rules to the experimental knowledge base; calculating the degree of certainty of the rules; calculating the centres of gravity of the clusters; modifying the membership function).
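The tuning loop of fig. 11 can be sketched as follows, under simplifying assumptions: observations extracted from the rules in the base are grouped by a naive one-dimensional k-means, and the clusters' centres of gravity become the apexes of the (triangular) membership functions; adding new rules shifts the centres and hence the apexes. The data and the clustering routine are invented for illustration.

```python
import random

random.seed(0)  # reproducible illustration

def kmeans_1d(values, k, iters=50):
    """Naive 1-D k-means: cluster values and return sorted centres of gravity."""
    centres = random.sample(values, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            # Assign each observation to its nearest centre.
            clusters[min(range(k), key=lambda i: abs(v - centres[i]))].append(v)
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return sorted(centres)

# Observations of one input variable, extracted from rules in the base.
observations = [4, 5, 6, 44, 48, 52, 90, 95, 99]
print("membership function apexes:", kmeans_1d(observations, k=3))

# Adding new rules shifts the centres of gravity, and hence the apexes.
observations += [58, 60, 62]
print("after new rules:", kmeans_1d(observations, k=3))
```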
Structural modelling on the linguistic level
A four-stage construction of the fuzzy model has been proposed (fig. 12). The first stage is the analysis of the real system. It covers the management of the project consortium, consisting of several or more project teams. In this chapter, this structure of the experiment is called a hierarchical model. This concept is introduced in the desire to obtain a hierarchy of management levels for the project consortium and the project teams. Next, a model is obtained referring to management at the level of the project team (SPMz). According to the theory of modelling [68], this corresponds to the basic model. In the conception of this chapter, the creation of a fuzzy model has been assumed (according to the modelling theory, an integrated model), formalizing management processes with the help of fuzzy rules.

Tuning the model parameters
We have proposed a concept of tuning the model that embraces the construction of the membership functions, according to the phases of the enterprise, for the input, state and output variables. The membership functions will be tuned for the input and output variables, while for the state variables permanent membership functions are proposed (the apex parameters of the functions and their form will be established). The construction of the membership functions will be conducted with the application of data from implemented information projects, with the use of clusterisation methods.

Adapting the parameters
It is planned to conduct the adaptation processes on two steering levels: direction of the work of the project team SPMz (the object of steering) and adaptation of the model SPMz-RFM (Software Project Management - Rule Fuzzy Model) on the level of the steering regulator. On the first level, adaptation will depend on the selection of "better" solutions with a higher scalar value of the generalised management states, while on
Figure 12. Stages of construction of the fuzzy model.
the second, on the qualification of the output variables by the team leader according to the quality of team management. It is necessary to emphasise that there is a great variety of methods of implementation and levels of quality in managing the project teams implementing information projects, which can lead to an over-"averaged", and therefore useless, model. For this reason, in the concept of adaptation on the level of the steering regulator it is recommended that a pre-selection of whole enterprises be carried out by an SPMz expert according to the quality levels in team management, e.g., into "successful" and "unsuccessful" projects. This will influence the classification of the input and output variables to the appropriate models and the position of the apexes of the membership functions. The method of modelling and adapting the model SPMz-RFM presented in this work: (1) allows various (established by the expert) types of model to be obtained; (2) creates conditions for the enterprise manager to use the proper model, appropriate to his knowledge, at the level of managing the team actually implementing the information project.

Adapting the structure
It is accepted that the model obtained will be a complete model. In connection with this, adaptation of the model structure is not assumed.
3.2. Construction of the model
The proposed model relies on knowledge-based solutions and the theories of dynamic systems and fuzzy sets [39]. The detailed design of the fuzzy models takes advantage of the experience of managing two environmental projects. Data from these applications have been utilised in tuning the fuzzy model using knowledge-based rules and membership functions. Knowledge from a third environmental project has been applied to verify the system, by means of self-tuning mechanisms. Within this chapter, the symbol SPMT means generating a set of solutions, consisting of IT methods and tools, for a given project team, and SPMp generating a set of solutions, consisting of IT methods and tools, for the whole consortium. The result of this work, Software Project Management of Teams based on the Fuzzy-Rule Mechanism, will be referred to as the SPMT-RFM system.

3.2.1. Fuzzy Models of Knowledge-Based System for Software Project Management
The starting point for the model's creation procedure has been an evaluation of the real system from the experimental perspective. Firstly, appropriate formal models have been developed and next, hierarchical and structural models of the project team have been constructed. With the project team in mind, the analytical (dynamic) integrated model has been developed. In order to build a useful model, elements of fuzzy control theory have been used in the form of fuzzy rules of the Mamdani type. The result is a matrix and vector model with a fuzzy sub-system. The completeness and consistency of the fuzzy-model rules have been verified. Dynamic state variables have been introduced to define (temporarily) the fuzzy values of the states of management.
The hierarchical model presents the hierarchical structure of management: the whole project consortium and the teams. The structure of the hierarchical model is given in fig. 13. It has the following levels of management (assigned to the respective management functions):

• Project co-ordinator: decisions made after comparing the scheduled and budgeted tasks with their actual status;
• Project team manager: planning based on evaluation of the IT methods and tools used in the process;
• Project team manager: decisions to change the IT methods and tools;
• Project team manager: introduction and follow-up on the use of the proposed solutions and their evaluation.

Preliminary forecasts are made to support the decision-making system at the level of individual project teams, built using the model SPMz-RFM, the inverse of the project team management system SPMz. The decision-making support system generates the actual increases in resources and time for further evaluation by the team manager. He decides
Figure 13. Hierarchical model of SPMp with emphasis placed on team management.
how to use the time and resources, introduces his own technology innovations (seen as changes to the IT methods and tools used) and lets the team use them. As the work progresses, the increases suggested by the manager continue to be modified to reflect any changes. The changes result in actual increases in the time allocated for the project tasks and in the resources for completing them; those are measured in stages, using a measurement system, for the individual teams.

3.2.3. Structural model
Let us now concentrate on modelling a system that is a single project team. Unlike the above hierarchical model (which we use to describe the developments within a whole system of software project management), obtained in the course of the process of model formalisation and description, the team management model performs a purely analytical function. It plays a key role in our synthesis process of the decision support system, which is meant for SPMz-RFM. The structural model, which describes project management at the project team level, focuses on data preparation procedures and preliminary data processing for a dynamic subsystem, yielding data for the hierarchical system of SPMp on its 4th level (concerning SPMz). The whole project management is synthetic. This means that the particular SPMz decision support systems are used to assist the general management processes at the team (4th) level, with a view to fulfilling the superior task of optimizing the management of the entire project. The structural team management model reveals a (module-based) structure of the formal (analytical) model of SPMz-RFM. As shown in fig. 14, it consists of four main areas: input data, preliminary processing of input data, the dynamic SPMz model and output data. The model has references to the 3rd and 4th levels of management in the hierarchical model (fig. 13).
Figure 14. The structural model of the SPMz-RFM (inverse to SPMz).
3.2.4. Integrated model
In the processes of integrated model design, simplification procedures have been used that eliminated descriptive variables (e.g. consortium level management) and grouped some elements (IT methods and tools according to the states of management and resources according to a project phase).
The discrete-time analytical integrated model shown in fig. 15 (compare fig. 14) describes the input variables (forecasted increases in the money and time), the formal team-management sub-models (dynamic and static), the state variables (knowledge, infrastructure, supporting technologies, and project processes) and the output variables (referring to the actual increases in resources: project money and time).
Figure 15. Formal, integrated, analytical model of SPMz-RFM describing the team management with the use of the state and input variables.
By introducing formal symbols for the data (variables) and operators (discrete functions), the model of SPMz-RFM (the fourth level of the hierarchical model given in fig. 13) can be described as follows. The preliminary input data include the technological innovations (the preliminary forecasted increases in the IT methods (Δmt) and tools (ΔTt)), as well as the preliminary forecasted increases in the money (Δst) and the time (Δct). Note that the resource increases originate from the changes in the IT methods and tools generated in the 3rd level of the system (fig. 13). The project phase plays the role of a decision variable, which reprograms the DSS used by the project managers (fig. 13). The preliminary processing also performs an analysis of the resource increases; these increases are derived from the project resources needed to implement or modify the IT methods and tools. The conversion of the input variables is done by using the function RΨ, forecasting the vector of preliminary resource increases Δgt. The variables of the changes of the methods and tools are aggregated in one vector of technological innovations Δvt, which is next converted
to the vector of the previous increases of the management states (Δxt-1) using the function of the state-of-management changes Rπ. Within the dynamic sub-model, the forecasted resource increases gt and the new management states xt are aggregated using a function RR, allowing the determination of the actual increases in money (Δst) and time (Δct) that prove to be necessary for the project implementation. A function Rx, called the state transition function, shows the transitions of the management states during the project. As presented in fig. 14, resulting from suitable decomposition procedures, the integrated model of the project management at the team level relies on the above-described variables, including the static and dynamic states, as well as on the characteristics of the static RΨ, Rπ, RR and dynamic Rx functions. As a result of our analysis, we propose to treat the model of SPMz-RFM as an integrated vector-matrix entity. Principally, the structure of this model includes static and dynamic parts (sub-models). As shown in fig. 16, within the dynamic part, containing two state-space (S-S) mechanisms, we have a classical linear state-space sub-system (one of the S-S mechanisms) and a fuzzy-rule (F-R) sub-system, including an F-R mechanism (static function) and a dynamic S-S mechanism.
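The data flow just described can be sketched in code. Only the structure (RΨ and Rπ as static pre-processing, Rx as the state transition, RR yielding the actual increases) follows the text; every function body, coefficient and variable value below is a placeholder assumption.

```python
from dataclasses import dataclass

@dataclass
class State:                 # generalised management states x_t
    knowledge: float
    infrastructure: float
    technologies: float
    processes: float

def R_psi(d_money, d_time):             # static: forecast resource vector (delta g_t)
    return (0.8 * d_money, 0.8 * d_time)

def R_pi(d_methods, d_tools):           # static: innovations (delta v_t) -> state increases
    dx = 0.1 * (d_methods + d_tools)
    return State(dx, dx, dx, dx)

def R_x(x_prev, dx):                    # dynamic: state transition function
    return State(x_prev.knowledge + dx.knowledge,
                 x_prev.infrastructure + dx.infrastructure,
                 x_prev.technologies + dx.technologies,
                 x_prev.processes + dx.processes)

def R_R(g, x):                          # actual increases (delta s_t, delta c_t)
    level = (x.knowledge + x.infrastructure + x.technologies + x.processes) / 4
    return (g[0] / (1 + level), g[1] / (1 + level))

x = State(0.2, 0.3, 0.1, 0.4)           # previous management states x_{t-1}
g = R_psi(d_money=100.0, d_time=30.0)   # forecasted resource increases
x = R_x(x, R_pi(d_methods=2.0, d_tools=1.0))
print("actual increases (money, time):", R_R(g, x))
```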
Figure 16. Integrated matrix-vector model of SPMz-RFM (static part: pre-processing sub-models; dynamic part: state-space S-S sub-system and fuzzy-rule F-R sub-system).
Thus, in general, this matrix-vector model of SPMz-RFM covers the two areas distinguished in the structural model (fig. 14, as well as fig. 15) as the static (pre-processing)
and dynamic parts. The black border in fig. 16 separates the static and dynamic sub-models, while the dotted line divides the dynamic part into the state-space and fuzzy-rule sub-systems.

3.2.5. Tuning of the Fuzzy Model
The developed model of SPMz-RFM, which has been briefly described above, has all but one of its elements established. It is the fuzzy-rule (F-R) mechanism, included in the F-R dynamic sub-system of the full dynamic integrated model of SPMz-RFM, that needs a parameter tuning procedure. Thus the process of optimization (concerning only the fuzzy-rule mechanism) should have two stages:

• development of the rule descriptions for an experiment (using data from software projects performed in reality); the set of these descriptions will be referred to as the experimental knowledge base;
• design of the membership functions based on the experimental knowledge base.

Two IT environmental projects were utilised to supply data for our knowledge base. Information on how these projects were managed has been acquired by using inductive methods of "machine learning". The particular sources of this knowledge were the following:

• the documentation of the considered IT projects, including descriptions of work packages;
• an expert evaluation of the effects of project realisation and completion, as well as the project reports (including own materials, website publications, notes of co-ordinators, etc.).

At first, the number of rules for the FIRST PROJECT was defined as:

n = k × l = 4 × 12 = 48    (12)

and in the case of the SECOND PROJECT the number was:

n = k × l = 10 × 12 = 120    (13)

where:
k - the number of teams
l - the number of reports

3.2.6. Adaptation of the model to the needs of new projects
3.2.6.1. THE SPMz-RFM MODEL AS A SUPPORT FOR SOFTWARE PROJECT MANAGEMENT. The project managers of selected teams decided that a THIRD PROJECT is of the same type (a similar subject and type of management) as the two previously considered projects. Therefore the previously tuned SPMz-RFM model (the fuzzy-rule
mechanism and the experimental knowledge base) could be used as a support in the management of the THIRD PROJECT (obviously, in terms of the team management) as a decision support system, SPMz-RFM, for evaluating the respective flow of design processes. These processes included: the preparation of initial data (indicators and data for the simulation models of pollutant emission and ambient concentration), model reliability tests and model integration. By considering the SPMz-RFM, the team managers have made their operating decisions on the forecasted project resources (precisely speaking, they could modify their decisions by looking into the system's suggestions).

Defining the membership function
The changes in the cluster gravity centres have an effect on the membership functions. As a result, the peaks of the membership functions have shifted. A modified model, SPMz-RFM + THIRD_PROJECT, has thus been designed (tuned) by the use of the data from the THIRD PROJECT, after it has been completed. As an analysis of the obtained outcomes showed, the two decision-support systems (i.e. the SPMz-RFM model based on the originally designed fuzzy mechanism and the SPMz-RFM + THIRD_PROJECT model augmented by the knowledge of the new data) proved to be effectively similar (fig. 17). The indicated deviations (in resources) are placed in the beginning, main and final phases of the project.
Figure 17. Results of the analysis of the obtained outcomes for the two decision-support systems.
4. ASSESSMENT OF EXISTING SOLUTIONS
The existing methodological solutions (the PMM and KADS methods, and the CMM and SPICE models) provide formal approaches to supporting SPM, concentrating exclusively on collections of procedures for the assessment of teams (the CMM model) or processes (the SPICE model). There are no overall solutions to assess the implementation and management of projects. The existing fuzzy models for scheduling projects support only the processes preparatory to project implementation. In the case of the methodological solutions (the PMM and KADS methods) we can observe the use of project tools to implement information systems, but not to manage them. In the light of the above discussion, the presented fuzzy model and the model SPMz-RFM indicate, on the one hand, the potential possibilities for making use of information project tools and, on the other, the creation of models and systems for managing knowledge-based information projects. In the first case, the model SPMz-RFM gives the possibility of selecting qualified solutions (on the basis of the generalised management states) for realising project processes and for the functioning of project teams. In the second case, a comparison of the described model SPMz-RFM with the two models that constitute an example of improving the paths of realising information projects shows that the model SPMz-RFM takes account of elements significant for the realisation of information projects: the level of infrastructure management (on the basis of z), knowledge (on the basis of w), the means of realising the processes (on the basis of p) and the information technologies applied (on the basis of n). It creates an integrated environment for the ongoing assessment of both teams and processes.

The concept of project team management assumed in this chapter also contributes to the search for new approaches in the field of creating organisational solutions. The introduction of the MNIiZ selection criteria for project teams increases the probability of selecting the best solutions (on the basis of strict assessment). It also creates conditions for gaining knowledge and more effective rule-object processing, and increases the efficiency of the mechanisms of co-operation between members of the project team (as a result of applying project tools for direct knowledge acquisition). At the same time it shortens production time (group work mechanisms) and raises product quality (constant control of product realisation), as well as assuring control over complex processes of a heuristic nature (control of management processes with the use of knowledge-based rules).

REFERENCES

[1] Balcerzak, S. and Gorski, J., Eksperyment w zastosowaniu Metody Punktow Funkcyjnych do szacowania projektow informatycznych, Materialy Konferencyjne, I Krajowa Konferencja Inzynierii Oprogramowania, Kazimierz Dolny 1999, ss: 395-407.
[2] Bazewicz, M., Metody i techniki reprezentacji wiedzy w projektowaniu systemow, Wydawnictwo Politechniki Wroclawskiej, Wroclaw 1994.
[3] Boehm, B. W., Horowitz, E., Westland, C., and Madachy, R., Cost Models for Future Software Life Cycle Processes: COCOMO 2.0, Annals of Software Engineering, J. D. Arthur and S. M. Henry (Eds.), J. C. Baltzer AG, Science Publishers, Amsterdam, The Netherlands 1995, pp: 57-94.
[4] Brodman, J. G. and Johnson, D. L., Return on Investment (ROI) from Software Process Improvement as Measured by US Industry, Software Process Improvement and Practice, John Wiley & Sons Ltd., Sussex, England and Gauthier-Villars 1995, pp: 35-47.
[5] Brooks, P. J., Mityczny osobomiesiac. Eseje o inzynierii oprogramowania, WNT, Warszawa 2000.
[6] Bubnicki, Z., Podstawy informatyczne systemow zarzadzania, Wydawnictwo Politechniki Wroclawskiej, Wroclaw 1993.
[7] Bubnicki, Z., Wstep do systemow ekspertowych, PWN, Warszawa 1990.
[8] Budgen, D., Introduction to Software Design, Addison-Wesley, New York 1994.
[9] Buschmann, F., Meunier, R., Rohnert, H., Sommerlad, P., and Stal, M., Pattern-Oriented Software Architecture, John Wiley & Sons, New York 1996.
[10] Butler, K., The Economic Benefits of Software Process Improvement, Crosstalk, Hill AFB, Ogden, UT 1995, pp: 14-17.
[11] Coleman, D. and Somerland, P., Object-Oriented Development: The Fusion Method, Prentice Hall, Englewood Cliffs, London 1994.
[12] Crosby, P., Quality Without Tears, McGraw-Hill, New York 1986.
[13] Cross, N., Engineering Design Methods, John Wiley & Sons, New York 1989.
[14] Curtis, G., Business Information Systems: Analysis, Design and Practice, Addison-Wesley Publishing Company, London 1993.
[15] Dean, T., Artificial Intelligence: Theory and Practice, Addison-Wesley, New York 1995.
[16] Dion, R., Process Improvement and the Corporate Balance Sheet, IEEE Software, October 1993, pp: 28-35.
[17] Driankow, D., Hellendoorn, H., and Reinfrank, M., Wprowadzenie do sterowania rozmytego, WNT, Warszawa 1996.
[18] Drucker, P. F., Skuteczne zarzadzanie. Zadania ekonomiczne a decyzje zwiazane z ryzykiem, PWE, Warszawa 1986.
[19] Durlik, I., Restrukturalizacja procesow gospodarczych. Reengineering. Teoria i praktyka, Agencja Wydawnicza Placet, Warszawa 1998.
[20] Dyczkowski, M. and Owczarzy, A., Wdrazanie gospodarczych systemow informacyjnych w zintegrowanym srodowisku zarzadzania, Materialy konferencyjne BIS '97, Poznan 1997, ss: 43-47.
[21] Fliegner, W., Metodyka i identyfikacja obiektow ksiegowych, Materialy konferencyjne, I Krajowa Konferencja Inzynierii Oprogramowania, Kazimierz Dolny 1999, ss: 245-262.
[22] Gane, C. and Sarson, G., Structured Systems Analysis and Design Tools and Techniques, Prentice Hall, New York 1979.
[23] Ghezzi, C., Jazayeri, M., and Mandrioli, D., Fundamentals of Software Engineering, Prentice Hall, New York 2001.
[24] Gorski, J., Poprawa technologii w zakresie inzynierii oprogramowania. Doswiadczenia praktyczne, Informatyka 2000, nr 7-8, ss: 30-34.
[25] Gupta, M., Kiszka, B., and Trojan, J., Multivariable Structure of Fuzzy Systems, IEEE Transactions on Systems, 1986, Vol. 7, pp: 638-656.
[26] Hapke, M. and Jaszkiewicz, A., Zintegrowane narzedzia harmonogramowania przedsiewziec programistycznych w warunkach niepewnosci, Materialy konferencyjne, I Konferencja Inzynierii Oprogramowania, Kazimierz Dolny 1999, ss: 65-76.
[27] Heller, M., The Japanese Menace, Management Today, 1998.
[28] Hickman, F. R. and Killin, K., Analysis for Knowledge-Based Systems: A Practical Guide to the KADS Methodology, Ellis Horwood Limited, London 1992.
[29] http://www.sqi.gu.edu.au/spice, Software Process Improvement and Capability Determination.
[30] Jaszkiewicz, A. i inni, Eksperymentalna ocena podstawowych technik inzynierii oprogramowania, Materialy konferencyjne, I Krajowa Konferencja Inzynierii Oprogramowania, Kazimierz Dolny 1999, ss: 311-324.
[31] Kacprzyk, J., Multistage Fuzzy Control, John Wiley Inc., New York 1997.
[32] Kerzner, H., Project Management, Van Nostrand Reinhold Company, New York 1994.
[33] Kingston, J., Pragmatic KADS: A Methodical Approach to a Small Knowledge-Based Systems Project, Expert Systems, 1992, Vol. 9, pp: 171-179.
[34] Kim, S. and O'Hara, P., Co-operative Knowledge Processing, Springer Verlag, London 1997.
[35] Kisielnicki, J. and Sroka, H., Systemy informacyjne biznesu, Agencja Wydawnicza Placet, Warszawa 1999.
[36] Klir, J., St. Clair, U., and Yuan, B., Fuzzy Set Theory: Foundations and Applications, Prentice Hall, New York 1997.
[37] Konieczny, J., Inzynieria systemow dzialania, WNT, Warszawa 1984.
[38] Korf, R. E., Planning as Search: A Quantitative Approach, Readings in Planning, Morgan Kaufmann Publishers Inc., New York 1996, pp: 566-577.
244
Cezary Orlowski
[39J Kowalczuk, Z., and Orlowski, C, Design of Knowledge-Based Systems in Environmental Engineering, Information Systems in the Environmental Engineering, Proceedings International Computer Science CO/wention, Gdansk, 2003 (accepted chapter). [401 Krawczyk, H., Mazurkiewicz A., Metoda wytwarzania i implementacji szkieletowych aplikacji rozproszonych ella zastosowan przemyslowych, Materialy konferencyjne, I Krajowa Konferencja Inzynierii Oprogramowania, Kazimierz Dolny 1999, ss. 76-83. [41J Krawczyk, T. Strategie zarzadzania systemami informacyjnymi [w]: Integracja architektury system6w informacyjnych przedsiebiorstw, Katedra Informatyki Gospodarczej i Analiz Ekonomicznych. Uniwersytet Warszawski, Warszawa 2000. [42] Kusiak, A. and Wang J., Decomposition of the Design Process, Journal of Mechanical Design, 1993, Vol. 115, pp: 687-695. [43J Luger, G. and Stublefield W, Artificial Intelligence. Structures and Strategies for Complex Solving, Addison-VVesley, New York 1998. [44] Lopacinski, T. and Kalinowska-Iszkowska M., Narzedzia firmy IBM do wspomagania proces6w planowania i realizacji projekt6w informatycznych, Materialy konferencyjne, I Kraiowa Konferencja Inrvnierii Oprogramowania, Kazimierz Dolny 1999, ss. 145-149. [45] Maciaszek, L., Requirements Analysis and System Design, Addison-VVesley, New York 2001. 146] Madachy, R., Systems Dynamics Modeling of an Inspection-Based Process, Proceedings of the 18th International Conference on Software Engineering, Berlin, March 1996, pp: 376-386. [47J Mamdani, E. H.: Applications offuzzy algorithms for control of a simple dynamic plant. Proc. IEEE, 1974, vol. 121, pp. 1585-1588. [48J Mesarovic, M. and Takahara Y., Abstract Systems Theory. Lecture Notes in Control and Information Science, Springer Verlag, New York 1989. [49] Mulawka, J., Systemy ekspertowe, WNT, Warszawa 1996. [50] Nerson, J. M., Aplying Object-oriented Analysis and Design, Communications of the ACM, 1992, Vol 9. [51J Orlowski, C, The Methods of Creating Membership Functions in the Fuzzy Type Rules of the Knowledge Object Bases, Proceedings, Australasian-Pacific Forum on Intelligent Processing and Manufacturing of Materials, Honolulu, USA 1999, pp: 654-661. [52] Pacholski, L. and Jablonski, J., Ergonomiczne i ekonomiczne aspekty optymalizacji ukladu czlowiekmaszyna, Zeszyty Naukowe Politechnilei Poenansleie], Organizacja i Zarzadzanie, Poznan 1996, Nr 19 ss.121-135. [53] Padulo, L. and Arbib, M. A., System Theory, WB Saunders, Paris 1974. [54] Paulk, M. C, Software Capability Maturity Model, Version 2, Draft Technical Report, Software Engineering Institute, Carnegie Mellon University, Pittsburgh 1997. [55J Paulk, M. C, Weber, C v; Curtis, B. and Chrissis, M. B., The Capability Maturity Model: GUIdelines for Improving the Software Process, Addison-Wesley, New York 1995. [56] Pawlak, Z., Rough Set and Data Mining. Proceedings IPMM '97, Gold Coast, 1997, Vol. 1, pp: 663667. [57J Peled, D., Software Realiability Methods, Springer Verlag, New York 2001. [58] Pfteeger, L., Software Engineering, Theory and Practice, Prentice Hall, New York 1998. [59] Piegat, A., Modelowanie i sterowanie rozrnyte, Akademicka Oficyna Wydawnicza EXIT, Warszawa 1999. [60J Primorse, P, Selecting and Evaluating Cost-effective MRP and MRP II, International Journal Operations/Production Management 1990, Vol. 1, pp: 51-66. [61] Robson, W, Strategic Management and Information Systems, Pitman Publishing, Boston 1994. 
[62J Rolstadas, A., Enterprise Performance Measurement, International Journal Production/Operations Management, 1998, Vol. 18, pp: 989-999. [63] Rutkowski, L., Tadeusiewicz R. (red.): Neural Networks and Soft Computing, Polish Neural Network Society, Zakopane 2000. [M] Singpurwalia, N. and Wilson, S., Statistical Methods in Software Engineering. Reliability and Risk, Springer Verlag 1999. [651 Slagmulder, R., Bruggeman, W, and Wassenhove, L., An Empirical Study of Capital Budgeting Practices for Strategic Investment in CIM Technologies, International Journal Production Economics, 1995, Vol. 40, pp: 121-152. [661 Slowinski, R. and Hapke, M., eds.: Scheduling under Fuzziness. Physica-Verlag, Heidelberg, 1999. [67J Sommerville, I., Software Engineering, Addison-VVesley, New York 1995. [68J Stoner, J. A., Kierowanie, PWN, Warszawa 1994.
Methods of building knowledge-based systems applied in software project management
245
[69] Szczerbicki, E., Orlowski, C, Qualitative and Quantitative Mechanisms in Management IT Projects in Concurrent Engineering Environment, System Analysis Modeling and Simulation, Gordon and Brench Science Vol. 43, No.2, pp. 219-230. [70] Tong, R. M., The Construction and Evaluation on Fuzzy Models in Advances in Fuzzy Set Theory and Applications, North Holland, Amsterdam 1979, pp: 559-576. [71] Ullman, J., Principles of Database and Knowledge Base Systems, Computer Science Press, Rockville 1988. [72] W\,glarz, J. (ed.), Project Scheduling: Recent Models, Algorithms and Applications, Kluwer, Dordrecht, 1999. [73] Wordsworth, J. B., Software Engineering with B, Addison-Wesley, New York 1996. [74] Yager, R. and Filew, D., Podstawy modelowania i sterowania rozmytego, WNT, Warszawa 1995. [75] Yourdon, E., Modern Software Analysis, Prentice Hall, New York 2001. [76] Zadeh, L. A., Fuzzy Sets as a Basisfor Theory of Possibility. Fuzzy Sets and Systems 1, 1978. [77] Ziegler, B., Teoria modelowania i symulacji, PWN, Warszawa 1984.
SECURITY TECHNOLOGIES TO GUARANTEE SAFE BUSINESS PROCESSES IN SMART ORGANIZATIONS
ISTVAN MEZGAR
1. INTRODUCTION
Developments in the fields of information technology, telecommunication and consumer electronics are extremely fast. The ability of different network platforms to carry essentially similar kinds of services, and the coming together of consumer devices such as the telephone, television and personal computer, is called "technology convergence" [1]. ICT (Information and Communication Technology), the "infocom" technology, covers the fields of telecommunication, informatics, broadcasting and e-media. Wireless (mobile and Wi-Fi) communication, a very fast developing field of telecommunication, is gaining a growing role in many areas as well. The connection of mobile devices to the Internet has created fundamentally new possibilities and services for users. Today the global nature of communications platforms (in particular, the Internet) is providing a key that is opening the door to the further integration of the world economy. At the same time, the low cost of establishing a presence on the World Wide Web is making it possible both for businesses of all sizes to develop a regional and global reach, and for consumers to benefit from the wider choice of goods and services on offer. Globalization is therefore the key theme in these developments. This technological convergence is not just about technology. It is also about services and about new ways of doing business and of interacting within society. The impact of the new services resulting from convergence can be felt in the economy and in society as a whole, as well as in the relevant sectors themselves. Because of this great
impact of information technologies and the level of knowledge content in products and services, the society of the XXI century is called the Information and Knowledge Society. The availability of individuals independently of location and time means mobility, and that is an important attribute of this society. The knowledge content of a product or process does not always appear spectacularly; in many cases it remains hidden. Today the greatest added value is in the areas of software, electronics and exotic materials. An important aspect is that these three areas refer not only to the end product, but also to the tools and organizations that build and produce the product. This information and knowledge age has three main characteristics [2]:
• Dematerialization: information is the source of 3/4 of added value in manufacturing,
• Connectivity: connecting computing and communication (e.g., equal chances for people based on networking),
• Virtual networks: virtual technologies, a networked economy with deep interconnections within and between organizations.
In order to meet the demands of the present era, networked information (info-communication) systems have an outstanding role. In managing these new types of systems, new aspects have come into focus in information and, later on, in knowledge management. The structure of an organization is in a recursive connection with its IC systems: IC technology offers new possibilities for restructuring the organization (and its business processes) itself, while in other cases the new demands of a business process force the development of a special IC solution. The final goal of all information systems is to provide data, information, knowledge or different services for the users (human beings), so taking basic human aspects (e.g., psychological ones) into consideration when approaching information and knowledge management is of vital importance. The need for security also originates from the users, as they will only use an IC system if they trust it. Trust is therefore essential to the Information and Knowledge Society, and it can be achieved by using different security services. The lack of trustworthy security services is a major obstacle to the use of information systems in private, business (B2B) and public services. Trust is intimately linked to consumers' rights, integrity, authentication, privacy and non-repudiation. Secure identification, authentication of users and communication security are the main problems in networked systems. This chapter concentrates on the problem of trust; namely, what information and security services and mechanisms have to be applied to provide an acceptable level of trust for users on different system levels during the life cycle of networked organizations. The reader will get an overview of the possible dangers of attacks against information and communication systems, in parallel with the possibilities to parry them. The chapter briefly introduces the present tools and technologies that
are appropriate for increasing the trust level of users in the case of different network types. The chapter does not intend to give a full overview or a detailed description of networking or security; rather, it aims to highlight the dangers of sending valuable information through networks, show how to avoid these traps, and push users in the direction of secure information systems and communication. As the chapter covers a very broad area, it is not possible to introduce all of these aspects in detail; references are given for each important part.

2. SMART ORGANIZATIONS-ARE THEY THE FUTURE?
2.1. Main characteristics of Smart Organizations
2.1.1. Definition of Smart Organization
Based on the results of information and communications technologies (ICTs), a new "digital" economy is arising. This new economy needs a new set of rules and values, which determine the behavior of its actors. Participants in the digital market realize that traditional attitudes and perspectives on doing business need to be redefined. One main aspect of this is that organizations in this environment are networked, i.e., inter-linked on various levels through the use of different networking technologies. Besides the Internet, new (or pilot-phase) solutions are on offer: wireless networks (Wi-Fi and mobile), powerline communication (using the electric power grid) and, as an efficient extension of the Internet, Grid technology. The main characteristics of the digital economy for market participants are as follows:
• Networking and horizontal communication, including the smart product,
• Networked environment,
• Knowledge-based technologies,
• Simplification and coordination of structure,
• Customer focus and real-time, ubiquitous responsiveness to technical and market trends (what customers want, anytime, anywhere),
• Flexibility, adaptability, agility, mobility,
• Organizational extendibility, virtuality,
• Shared values, trust, confidence, transparency and integrity,
• Ability to operate globally, co-operating with local cultures.
In this turbulent environment only those organizations can survive which effectively apply the results of the different disciplines. Smart organizations (SO) belong to this category. "The term 'smart organization' is used for organizations that are knowledge-driven, internetworked, dynamically adaptive to new organizational forms and practices, learning as well as agile in their ability to create and exploit the opportunities offered by the new economy" [3].
There are three characteristics of smart organizations that make them really special:
• They are motivated to build collaborative partnerships, which encourage and promote the discussion of ideas. Customer focus and meeting customer expectations are recognized as key success factors.
• Smart organizations can respond positively and adequately to change and uncertainty, so they survive and prosper in the new economy.
• Smart organizations can identify and exploit new opportunities by applying the strength of "smart" resources, i.e., information, knowledge, relationships, and innovative and collaborative intelligence.
In the following sections the main characteristics of smart organizations will be discussed: the organizational form (section 2.2), the application of knowledge (section 2.3) and networking technologies (section 2.4).

2.1.2. Life cycle of networked organizations
To allow dynamic re-configuration of the whole system in response to market changes, significant requirements must be met. On one side, individual enterprises must improve their flexibility and extend their connections with the other members of the system. On the other side, the production infrastructure must support fast interaction as well as information sharing between the nodes. The main characteristics of networked organizations are the basis for fast reaction, system organization, aggregation and co-ordination. The life cycle of a networked organization (NO) can be divided into the following phases: Forming, Start-up Operation, Operation, Closing Operation, and Break-up. In the Forming phase the organizational units/enterprises discuss the administrative, technical and financial conditions of the co-operation; the first contact between the managers of the different organizations takes place in this phase. At the end of the phase the networked organization is ready to communicate and to exchange data, information and knowledge, both from technical and from administrative/legal aspects (included activities: identification, design). The personal/human connections (if they are needed) have also been established between the staffs. In the second, Start-up phase the new organization starts to operate and the information/data exchange processes begin; tests of reliable access, of the content of data, etc. are also carried out. As a result of this phase, reliable, tested information exchange has been verified. In the Operation phase production is going on in the NO. Information, administration, business and financial processes are running, and the NO fulfils its production goals. The Closing Operation period is for closing the communication channels, the final exchange of information, checking database consistencies, invalidating passwords, etc.
Table 1 Tasks in the life cycle phases of a networked organization

Forming NO
• Tasks in organisational hierarchy: discussions of top managers; self-confidence ("we can do it"); the organisation has a good frame contract for co-operation
• Tasks in communication: identifying the partners; exchange of basic administrative, legal, technical and financial information
• Tasks in information handling, storage: safe storage of negotiation materials

Start-up operation
• Tasks in organisational hierarchy: discussion of medium-level managers based on contract forms; contact between network/system administrators
• Tasks in communication: establish and test communication connections (physical, network, SW, standards)
• Tasks in information handling, storage: data conversion; access hierarchy; password issuing; ownership of information

Operation
• Tasks in organisational hierarchy: discussion of engineers and managers according to product technical documentation
• Tasks in communication: control of the communication
• Tasks in information handling, storage: safe storage and retrieval (access) of data/information

Closing operation
• Tasks in organisational hierarchy: discussions of top managers, participating engineers, and managers
• Tasks in communication: develop disconnection schedule
• Tasks in information handling, storage: check DB consistencies; archive materials

Break-up NO
• Tasks in organisational hierarchy: communication of network/system administrators
• Tasks in communication: disconnection of systems
• Tasks in information handling, storage: elimination of access rights and passwords
In the Break-up phase all types of connections are eliminated, i.e., the co-operation is closed, and the units of the NO continue their work independently or join another NO. In Table 1 some of the main tasks/activities in communication, in information management and in the organization are shown for the different life cycle phases of a NO. Of course this table is strongly simplified; its goal is to give only a glimpse of the huge amount and diversity of activities that have to be processed (partially automatically) while dealing with NOs. The duration of the above phases can be very different, depending on the information infrastructure (HW, SW), the organization structure (Orgware) and the education and cultural level of the staff (Manware) at the participating firms. The first and the last two life-cycle phases can span from a few hours to a few days, while the Operation phase depends mainly on the production volume and on the type of organization/co-operation. In the case of virtual enterprises the whole cycle refers to the period of producing a product, while in a smart organization it can cover several months, even years. The reliable operation of the production, secure communication and secure data storage are important in a NO, as these ensure the technical side of trust, based on which trust between people and systems can evolve and remain for a longer period.
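Since the phases above follow a strict order, the life cycle can be illustrated with a small state machine that permits only legal transitions and attaches to each phase the tasks it triggers. The following Python sketch is for illustration only; the class and method names are invented here, and the phase names and task strings are abbreviated from the text and Table 1.

# Minimal sketch of the networked-organization (NO) life cycle as a state
# machine. Phase names follow the text; tasks are abbreviated from Table 1.

PHASES = ["Forming", "Start-up", "Operation", "Closing", "Break-up"]

TASKS = {
    "Forming":   ["identify partners", "exchange legal/technical info",
                  "store negotiation material safely"],
    "Start-up":  ["establish and test connections", "data conversion",
                  "issue passwords, set access hierarchy"],
    "Operation": ["control communication", "safe storage/retrieval of data"],
    "Closing":   ["develop disconnection schedule", "check DB consistencies",
                  "archive materials"],
    "Break-up":  ["disconnect systems", "eliminate access rights and passwords"],
}

class NetworkedOrganization:
    def __init__(self):
        self.phase = "Forming"          # every NO starts in the Forming phase

    def advance(self):
        """Move to the next life-cycle phase; phases cannot be skipped."""
        i = PHASES.index(self.phase)
        if i == len(PHASES) - 1:
            raise RuntimeError("NO already broken up")
        self.phase = PHASES[i + 1]
        return TASKS[self.phase]        # tasks that must be processed now

no = NetworkedOrganization()
while no.phase != "Break-up":
    for task in no.advance():
        print(no.phase, "->", task)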
2.1.3. Human role in smart organization
The selection of the right partners and taking care of these relationships can help a company focus on what creates the most value for customers and concentrate on its core activities. A NO can be considered a temporary, culturally diverse, geographically dispersed, electronically communicating group of organizations and people. The attribute temporary in the above definition describes organizations and teams whose members may have never worked together before and who may not expect to work together again as a group [4]. The characterization of virtual teams as global implies culturally diverse and globally spanning members that can think and act in concert with the diversity of the global environment [5]. Finally, it is a heavy reliance on computer-mediated communication technology that allows members separated by time and space to engage in collaborative work. Creating a NO takes more than just information technology. A study on issues of information technology and management concluded that there is no evidence that IT provides options with long-term sustainable competitive advantage. The real benefits of IT derive from the constructive combination of IT with organization culture, supporting the trend towards new, more flexible forms of organization [6]. Information technology's power is not in how it changes the organization, but in the potential it provides for allowing people to change themselves. Creating these changes, however, presents a whole new set of human issues. Among the biggest of these challenges is the issue of trust between partner organizations in the NO [7].

2.2. Organizational form
The implications of the above developments for organizations have led to a proliferation in terminology applied primarily to enterprises, i.e., terms such as agile enterprise, networked organization, virtual company, extended enterprise, ascendant organization, knowledge enterprise, learning organization, smart organization. Each definition has its nuance, depending on what particular trait, or combination of traits, is given emphasis, but basically each term covers the same idea: the networked co-operation of independent, flexible organizational units. As an example, the virtual enterprise (VE) has been selected for introduction. Most of the characteristics described in the following can be applied to the other organizational types as well.

2.2.1. Main characteristics of VE
Most VE theorists refer to the holonic doctrine, first introduced by Arthur Koestler in his book "The Ghost in the Machine" [8]. This theory is based on the holon concept, where holons are defined as independent entities sharing some basic features: openness (they are able to co-operate with each other to reach a common goal), flexibility (each of them can easily re-configure itself in response to an external stimulus), and similarity (they share the same basic principles, values and purposes). A holon is said to be "a whole into itself and a part of other wholes", so holons are defined as independent entities that are capable of coordinated behavior.
A system including entities with such characteristics, along with the necessary links to support their mutual interaction, is called a holonic system. In its most common representation, a holonic system is seen as a network graph, where nodes represent holons and arcs indicate interaction links between the nodes. In recent years the holonic doctrine has been applied to the production domain, leading to the concept of the VE. This new organizational paradigm is founded on the assumption that the production environment is going to transform itself into a holonic system. To be part of such a system, individual enterprises have to change into holons, that is, to become flexible and open enough to fit the above definition. At the same time, the environment must support the integration of these enterprises within an evolving system, taking the form of a multi-layer network. This is obtained through efficient communication and transportation means, as well as through the spread of principles, values and know-how across the network. There are different forms of the distributed enterprise; the VEs are one of the most up-to-date forms of production. Based on the different definitions of the VE it can be stated that the intensive use of computer networks and high-level organizational flexibility are the main parameters of VEs. Enterprises forming a holonic production system are potentially enabled to co-operate with each other to achieve a common goal. This happens in reaction to an external stimulus, taking the form of a new business opportunity which can be better exploited by several joined enterprises than by an individual firm. In these circumstances a virtual enterprise is created. In its current definition, the VE is formed by a proper combination of specialized nodes, including financial and engineering firms, manufacturers, assemblers and distributors. This structure can be seen as a holarchy, in that it is a temporary, goal-oriented aggregation of several individual enterprises. Each VE is created to pursue a specific business objective, and remains in existence for as long as this objective can be pursued. This temporary aggregation is supposed to involve enterprises from different sectors and categories. After that, the individual nodes resume their independence from each other. Node resources that were previously allocated to the expired business are re-directed toward the node's individual goals, or toward other VEs it may have joined. To allow this kind of dynamic re-configuration, the requirements already mentioned in section 2.1.2 must be met: individual enterprises must improve their flexibility and extend their connections with the other members of the system, and the production infrastructure must support fast interaction as well as information sharing between the nodes. Some of the most significant benefits expected for enterprises joining a virtual organization are:
• New business opportunities become available, by combining the productive capacity and marketing strength of all nodes in the virtual organization.
• Design and development capacity is increased by knowledge sharing between nodes with complementary skills.
• Cost and risk factors for the development of new products are shared among the nodes.
• Due to the specialization of roles within the network, each individual enterprise is enabled to focus on its core processes, thus optimizing and improving them.
The reasons for creating VEs include rapidly evolving markets, the reduction of design and manufacturing times (because of shorter product life cycles), and the increased efficiency of communication and transportation means. In practice there are two main ways to form a VE: decompose a large company into smaller units, or aggregate small firms (e.g., Small and Medium-sized Enterprises, SMEs) into the form of a VE [9]. The VEs formed by the two approaches have different requirements, as both the inherited characteristics and the goals of the original production units are very different. The common environmental factors that make VE realization possible are fast transport and communication means, and the spread of principles, know-how and business practice to all enterprises in the VE.

2.2.2. Importance of safe communication in VE
The basic characteristic of the VE is flexibility, both in information flow and in material flow. All the main events of its life cycle are connected to communication on the network. The communication requirements for a VE can be summarized as follows:
1. Integration of different communication forms and resources. Communication through connected telephone, computer and cable networks, the possibility of applying different protocols, and the connection of wired and wireless equipment.
2. Reliable and high-quality communication services. Reliability covers high on-service time (technical reliability), high availability (well designed/balanced network-resource reliability), HW and SW security for both equipment and communication lines (access reliability), and well controlled/organized networks (organization reliability), all at reasonable cost.
3. Global time coordination. The exact coordination in time of the different actions during the life cycle of the VE is essential, so a "general time" has to be declared for communication.
4. Traceable communication. Traceability means documenting and auditing the communication in a way that fulfills the requirements of bookkeeping (e.g., delivery report and receipt notification) and legal aspects (e.g., digital signature; see the sketch below).
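Requirement 4 lends itself to a compact illustration: one common way to make a communication log tamper-evident is to chain the records with a cryptographic hash, so that any later alteration of an entry invalidates all subsequent hashes. The following Python fragment is an illustration only, not a legally sufficient audit mechanism (timestamps, signatures and secure storage would still be needed); all names are invented for the example.

# Sketch of a tamper-evident communication log: each record stores the hash
# of the previous record, so altering any entry invalidates all later hashes.
import hashlib
import json

log = []

def append_record(event):
    prev_hash = log[-1]["hash"] if log else "0" * 64
    body = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    log.append({"event": event, "prev": prev_hash,
                "hash": hashlib.sha256(body.encode()).hexdigest()})

def verify_log():
    prev_hash = "0" * 64
    for rec in log:
        body = json.dumps({"event": rec["event"], "prev": prev_hash},
                          sort_keys=True)
        expected = hashlib.sha256(body.encode()).hexdigest()
        if rec["prev"] != prev_hash or rec["hash"] != expected:
            return False
        prev_hash = rec["hash"]
    return True

append_record("delivery report sent to partner A")
append_record("receipt notification from partner A")
print(verify_log())          # True
log[0]["event"] = "forged"   # any tampering...
print(verify_log())          # ...is detected: False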
As the goal of the present chapter is the description of smart organizations from security aspects, in the following the HW and SW security of equipment and communication lines (access reliability) and the control/organization of networks (organization reliability) will be discussed. The security requirements for a VE can be listed as follows:
1. Protection of all types of enterprise data (for every company forming the VE): privacy and integrity of all types of documents during all phases of storage and communication (data and communication security: certification, encryption),
2. Confidential access control for the companies,
3. Authorization and authentication of services (digital signature; a minimal sketch illustrating requirements 1 and 3 is given below).
These services need to be flexible and customized to meet a wide array of security needs, including specific high-level requirements. In order to fulfill the communication and security demands, some basic aspects have to be taken into consideration while selecting security and communication technologies:
1. Platform-independent SW tools have to be applied,
2. Standards have to be applied (accepted and "de facto" standards alike),
3. Appropriate architectures with the ability to integrate different resources have to be chosen.
Fulfilling all of the introduced requirements would be very hard, if not impossible, for individual enterprises, so different general network and organizational structures have been developed that have been carefully designed and tested. These structures can be defined as reference architectures, and they are available both for the organization and for the information infrastructure of VEs.
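To make requirements 1 and 3 concrete, the following minimal Python sketch encrypts a document for confidential exchange and signs it so the receiving partner can authenticate its origin. The use of the Python "cryptography" package is an assumption of this example, not a tool prescribed by the reference architectures; key distribution, certification and access control are taken to be handled elsewhere.

# Minimal sketch: confidentiality (encryption) plus authentication (digital
# signature) for a document exchanged between two VE partners.
# Assumes the Python "cryptography" package; key management is out of scope.
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

document = b"confidential VE production schedule"

# Requirement 1: privacy and integrity of stored/transmitted documents.
symmetric_key = Fernet.generate_key()
ciphertext = Fernet(symmetric_key).encrypt(document)

# Requirement 3: authentication of the sender via a digital signature.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
signature = private_key.sign(
    document,
    padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                salt_length=padding.PSS.MAX_LENGTH),
    hashes.SHA256(),
)

# The receiving partner decrypts and verifies (verify raises on tampering).
assert Fernet(symmetric_key).decrypt(ciphertext) == document
private_key.public_key().verify(
    signature, document,
    padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                salt_length=padding.PSS.MAX_LENGTH),
    hashes.SHA256(),
)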
2.3. Knowledge technologies and applications

Smart organizations are, by definition, knowledge-driven. This knowledge-driven characteristic includes both the technologies and their applications. In this section some new, promising technologies will be introduced (the ant algorithm, agent technology) that can be applied in the operation of networked organizations. Some aspects of knowledge management will also be introduced, as this application field is very important for the effective operation of networked organizations.

2.3.1. Knowledge technologies
In the following, a short introduction is given to the different types of knowledge technologies that are applied in networked organizations. Beyond expert systems, Knowledge-Based Systems (KBS) were the first main commercialization of artificial intelligence (AI) research. Expert systems make it possible to capture human expertise and use this knowledge to aid expert decision-making, improve non-expert decision-making, and solve complex problems more efficiently. Other KBS technologies include artificial neural networks (ANN), fuzzy logic, genetic algorithms, the ant algorithm and data mining. Intelligent agents can be applied in different fields of networked systems. Today knowledge technologies are applied not only separately, but in different combinations as well. The limitations of the separate systems have been a central driving force for creating intelligent hybrid systems, where two or more techniques are combined to overcome the limitations of the individual techniques.
Most complex domains have many different component problems, each of which may require different types of processing. The different components of intelligent systems communicate their results among themselves to produce the final result(s). These combinations of different knowledge technologies (hybrid systems) open new application possibilities in many fields [10]. In the following, ANNs, the ant algorithm and intelligent agents will be shortly introduced as important and evolving fields of knowledge technologies. As the sharing of knowledge is important in an SO, KIF also has to be mentioned in a few words.
2.3.1.1. ARTIFICIAL NEURAL NETWORKS. Artificial neural networks (ANNs) can be regarded as trainable universal approximators. ANNs have proven to be equal, or superior, to other pattern recognition learning systems over a wide range of domains. The majority of ANN models (e.g., the most frequently used back-propagation (BP) model), however, can have problems, e.g., with lengthy training times, dependence on the initial parameters, the lack of a problem-independent way to choose an appropriate network topology, their incomprehensible (black box) nature, and the unavailability of suitable training sets [11].
2.3.1.2. ANT ALGORITHMS AND SWARM INTELLIGENCE. Research in social insect behavior has provided computer scientists with powerful methods for designing distributed control and optimization algorithms. These techniques are being applied successfully to a variety of scientific and engineering problems. In addition to achieving good performance on a wide spectrum of 'static' problems, such techniques tend to exhibit a high degree of flexibility and robustness in a dynamic environment. Ant algorithms and swarm intelligence systems have been offered as a novel computational approach that replaces the traditional emphasis on control, pre-programming and centralization with designs featuring autonomy, emergence and distributed functioning. These designs are proving flexible and robust, able to adapt quickly to changing environments and to continue functioning even when individual elements fail. Swarm intelligence can be defined as the field which covers "any attempt to design algorithms or distributed problem-solving devices inspired by the collective behavior of social insect colonies and other animal societies" [12]. Ant algorithms were inspired by the observation of real ant colonies. Ants are social insects, that is, insects that live in colonies and whose behavior is directed more to the survival of the colony as a whole than to that of a single individual component of the colony. An important and interesting behavior of ant colonies is their foraging behavior and, in particular, how ants can find the shortest paths between food sources and their nest. While walking from food sources to the nest and vice versa, ants deposit on the ground a substance called pheromone, forming in this way a pheromone trail. When more paths are available from the nest to a food source, a colony of ants may be able to exploit the pheromone trails left by the individual ants to discover the shortest path from the nest to the food source and back.
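This foraging mechanism can be reproduced in a few lines of code. The sketch below is a deliberately simplified illustration rather than a full ant colony optimization algorithm: artificial ants repeatedly choose between two paths of different length, pheromone evaporates over time, and each returning ant reinforces its own path inversely to the path's length, so the colony gradually converges on the shorter path.

# Simplified illustration of pheromone-trail path selection between a nest
# and a food source; the colony converges on the shorter of two paths.
import random

lengths = {"short": 1.0, "long": 2.0}     # two available paths
pheromone = {"short": 1.0, "long": 1.0}   # equal trails at the start
EVAPORATION = 0.05

for ant in range(500):
    total = pheromone["short"] + pheromone["long"]
    # Each ant chooses a path with probability proportional to its trail.
    path = "short" if random.random() < pheromone["short"] / total else "long"
    # Evaporation weakens all trails; the returning ant reinforces its path
    # inversely to its length, so shorter paths accumulate pheromone faster.
    for p in pheromone:
        pheromone[p] *= (1.0 - EVAPORATION)
    pheromone[path] += 1.0 / lengths[path]

print(pheromone)   # the "short" trail ends up far stronger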
Ant colony optimization (ACO) algorithms show similarities with some optimization, learning and simulation approaches such as heuristic graph search, Monte Carlo simulation, neural networks and evolutionary computation. Within the artificial life field, ant algorithms represent one of the most successful applications of swarm intelligence. One of the most characteristic aspects of swarm intelligent algorithms, shared by ACO algorithms, is the use of the stigmergetic model of communication. (Communication among agents that is indirect and asynchronous, mediated by the environment itself, is called stigmergy. This form of communication is typical of social insects.) This form of indirect distributed communication plays an important role in making ACO algorithms successful. There are examples of applications of stigmergy based on social insect behaviors, such as task allocation in a distributed mail retrieval system, a data clustering algorithm [13], and the adaptive learning of routing tables in communications networks [14].
2.3.1.3. INTELLIGENT AGENTS. Intelligent agents can be applied in many fields of distributed systems. Agents can embody the holonic method in the field of programs, as agents can represent the role of holons very well. An agent is an embedded computing entity (a software and/or hardware system) situated in an environment, and it is capable of autonomous action in this environment in order to meet its design objectives [15]. Intelligent agents (IA) have three basic characteristics: autonomy, learning and cooperation. Autonomy refers to the principle that agents can operate on their own without the need for human control. Agents have their own internal goals and states, and they act in a manner to meet their goals. A key element of their autonomy is their proactiveness, i.e., their ability to 'take the initiative' rather than acting simply in response to their environment. Agents should be able to interact and co-operate with their environment. In order to co-operate, agents need to possess a social ability, i.e., the ability to interact with other agents and possibly humans via some communication language. For an agent to be really intelligent, it has to learn and adapt itself as it reacts to and/or interacts with its external environment. An agent is (or should be) an immaterial bit of intelligence; as a key attribute of any intelligence is the ability to learn, this is a key characteristic of an intelligent agent. In addition, learning can take the form of increased performance over time. A system can be called agent-based when the key abstraction used is that of an agent. In principle, an agent-based system might be conceptualized in terms of agents, but implemented without any software structures corresponding to agents at all [16]. A multi-agent system is designed and implemented as several interacting agents, and is more general, but at the same time significantly more complex, than the single-agent one. However, there are a number of situations where the single-agent case is appropriate. An example is the class of systems known as expert assistants, where an agent acts as an expert assistant to a user attempting to use a computer to carry out some task. As was introduced earlier, the theoretical basis for networked systems is the holonic theory. The independent, flexible unit, the holon, is represented in software technology by an agent. In modeling distributed enterprises, decisions are made by interacting autonomous units or agents, and the system is based on multi-agent solutions. Global structure and behavior are emergent, resulting from the cumulative effects of the actions and interactions of agents.
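The three basic characteristics can be made tangible with a toy example. The Python sketch below (all class and method names are invented for this illustration) shows the typical sense-decide-act loop of an autonomous agent: it perceives its environment, acts proactively to pursue its own internal goal without external control, and adapts an internal parameter from feedback, a primitive form of learning.

# Toy agent with a sense-decide-act loop: autonomous (pursues its own goal),
# proactive (acts without being asked), adaptive (adjusts a threshold from
# feedback). Names are invented for this illustration.
import random

class MonitoringAgent:
    def __init__(self, goal_load=0.7):
        self.threshold = goal_load   # internal goal: keep load below threshold

    def sense(self, environment):
        return environment["load"]   # perceive the current state

    def act(self, load):
        # Proactive decision: intervene before being told to.
        return "reduce traffic" if load > self.threshold else "idle"

    def learn(self, overloaded):
        # Primitive learning: tighten the threshold after each overload.
        if overloaded:
            self.threshold = max(0.5, self.threshold - 0.01)

agent = MonitoringAgent()
for step in range(10):
    env = {"load": random.random()}          # stand-in for the real network
    load = agent.sense(env)
    action = agent.act(load)
    agent.learn(overloaded=load > 0.9)
    print(step, round(load, 2), action)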
2.3.1.4. KNOWLEDGE SHARING. The application of knowledge-based systems is becoming more frequent, so knowledge exchange and knowledge sharing have an increasing role. In this field, KIF (Knowledge Interchange Format) is a language designed for use in the interchange of knowledge among disparate computer systems [17]. It has declarative semantics (i.e., the meaning of expressions in the representation can be understood without appeal to an interpreter for manipulating those expressions); it is logically comprehensive (i.e., it provides for the expression of arbitrary sentences in the first-order predicate calculus); and it provides for the representation of knowledge about knowledge. KIF is not intended as a primary language for interaction with human users (though it can be used for this purpose). Different programs can interact with their users in whatever forms are most appropriate to their applications (for example frames, graphs, charts, tables, diagrams, natural language, and so forth). As a pure specification language, KIF does not include commands for knowledge base query or manipulation.

2.3.2. Knowledge management
Organizations are continuously changing, in the direction of increasing complexity and with increasing frequency. The motivation for this change is the changing business environment in which these organizations are operating. To be able to react positively to these changes, organizations have to be flexible and adaptive. Radically changing organizational environments that demand an ever faster rate of information processing, information renewal and knowledge generation have motivated managers to retrieve, archive, store and disseminate their organization's information by using advanced information technologies. The company's organizational performance may be characterized by an economic transition from an era of competitive advantage based on information to one based on knowledge. As an answer to this complexity and these fast-changing demands, not only information but knowledge also has to be captured and processed. Companies have to make decisions based on uncertain and incomplete information as well, and knowledge-based systems can process these tasks with a lower error rate. The earlier era was characterized by relatively slow and predictable change that could be handled by most formal information systems. During this period, information systems based on programmable recipes for success were able to deliver their promises of efficiency based on optimization for given business contexts. Corporations now have to act not according to pre-defined rules of the market but on understanding and adapting as the rules of the market, as well as the market itself, keep changing. The emergence and quick spread of the different types of virtual enterprises and other types of agile organizations prove this theory, as these types of organizations are based on changing business rules, formulas and assumptions.
The new world of knowledge-based industries is distinguished by its emphasis on precognition and adaptation, in contrast to the traditional emphasis on optimization based on prediction. It is important to distinguish among data, information and knowledge, as today computers use all three in parallel. The generally accepted view sees data as simple facts that become information as data are combined into meaningful structures, which subsequently become knowledge as meaningful information is put into a context and when it can be used to make predictions. According to this view, data are a prerequisite for information, and information is a prerequisite for knowledge. There are two main approaches to knowledge management: the first one argues that it is possible to represent knowledge in forms that can be stored in computers, while the second one states that knowledge resides in the user's subjective context of action, based on the information stored in the computer. According to the first stream, knowledge management is the strategic application of collective company knowledge and know-how to build profits and market share. Knowledge includes ideas, concepts and know-how created through the computerized collection, storage, sharing and linking of corporate knowledge layers. Advanced technologies make it possible to extract additional data, information and knowledge (through machine learning) from the corporate "mind". The other interpretation of knowledge is given by Churchman [18]: "knowledge resides in the user and not in the collection of information ... it is how the user reacts to a collection of information that matters". Taking this approach to knowledge into consideration, Malhotra [19] proposed the following definition of knowledge management: "Knowledge management caters to the critical issues of organizational adaptation, survival, and competence in face of increasingly discontinuous environmental change. Essentially, it embodies organizational processes that seek synergistic combination of data and information-processing capacity of information technologies, and the creative and innovative capacity of human beings". The new world of technologies (electronic and mobile) needs a very high level of adaptability to incorporate dynamic changes into the business and information architecture, and the ability to develop systems that can be readily adapted to the dynamically changing business environment. Organizations operating in this new business environment therefore need to be adept at the generation and application of new knowledge, as well as at the ongoing renewal of existing knowledge archived in company databases. Information systems enter into nearly all fields of company and private life, and in human-computer interaction the role of the human being (as developer, operator and user) is growing to a great extent. New interfaces have to be developed (e.g., also for the disabled), and the importance of trust in information and communication systems takes a central role as well.

2.4. Network technologies for smart organizations

2.4.1. Trends in information technology
Computer network technologies, as one of the main drivers of convergence and globalization, are integrated into all fields of the economy, in different applications in industry,
banking, health care, etc. Network connections are not limited to one enterprise (Intranet), to a country or to a certain sector of the economy, but extend to many functions and to the whole world. This globalization trend can be identified in most sectors of the economy. Functional integration and globalization have brought about the integration of material flows, information flows and money circulation, which are the three basic components of complex production and service processes. This deep integration of information and communication technologies into the whole company is changing the culture, the structures and the (business) processes of companies. The globalization of the economy means the keen co-operation of firms worldwide, and this co-operation means the intensive application of information and communication technologies. Distributed, networked information systems can fulfill these demands, and information management methods, technologies and tools have to adapt to these challenges. The integration of computer networks and mobile technologies has made the communication channels more crowded, as a "mobile citizen" has access to different data sources and information systems independently of his/her location and the time of day. These new infocom systems have generated plenty of new problems, but one of the main challenges is security, both of information handling and of communication. As globalization today is based not only on multinational (giant) firms, but deeply involves SMEs (Small and Medium-sized Enterprises) as well, the problem of security affects a very broad group of organizations from all sectors of the economy, as well as financial and government bodies. In this section the different networking technologies that can be applied in smart organizations will be introduced. The conventional wired technology is only summarized; more details are given on the fast-spreading wireless and mobile networks, which are now starting to compete and are important actors in the market. Powerline communication and Grid technology have only just started, but they are promising for the future. These technologies can be combined well; they can be applied with different goals, so they are only partially competitors of each other. The security characteristics of each network technology will be described in the security section later on.

2.4.2. Wired network technology
Open Systems Architectures (OSA) have become an important approach to developing flexible, adaptable sets of methodologies, standards and protocols for structured communication systems. An OSA is a layered hierarchical structure, configuration or model of a communications or distributed data processing system. It enables system description, design, development, installation, operation, improvement and maintenance to be performed at a given layer or layers in the hierarchical structure. Each layer provides a set of accessible functions that can be controlled and used by the functions in the layer above it, and each layer can be implemented without affecting the implementation of the other layers. System performance can thus be altered by the modification of one or more layers without altering the existing equipment, procedures and protocols at the remaining layers.
Table 2 TCP/IP and security protocols in the network

• Layer 7, Application
• Layer 6, Presentation
• Layer 5, Session
• Layer 4, Transport: TLS (Transport Layer Security Protocol), WAP/WTLS
• Layer 3, Network: IPv6
• Layer 2, Data Link
• Layer 1, Physical: Electromagnetic Emission standard (89/336/EEC, European Economic Community guideline)
An OSA may be implemented using the Open Systems Interconnection Reference Model (OSI-RM) as a guide while designing the system to meet performance requirements. The model employs a hierarchical structure of seven layers. Each layer performs a value-added service at the request of the neighboring higher layer and, in turn, requests more basic services from the next lower layer. The names of the seven layers and the related protocols are shown in Table 2. The security protocols are also shown in the table; they will be discussed in section 4.6. A good and detailed work on computer networks is [20].
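The layering principle, in which each layer adds its own service to the data received from the layer above and any single layer can be replaced without disturbing the others, can be illustrated with a short sketch. The layer names and header format below are invented for the example; they are not the real OSI protocols.

# Illustration of layered encapsulation: each layer wraps the payload it gets
# from the layer above, and unwrapping happens in reverse order on receipt.
# The layer names/headers are invented for the example, not real protocols.

def make_layer(name):
    def send(payload):
        return f"[{name}]{payload}[/{name}]"                 # add this layer's header
    def receive(frame):
        return frame[len(name) + 2 : -(len(name) + 3)]       # strip the header
    return send, receive

app_s, app_r = make_layer("APP")
tr_s, tr_r = make_layer("TRANSPORT")
net_s, net_r = make_layer("NETWORK")

# Sending: the message descends the stack, gaining one header per layer.
frame = net_s(tr_s(app_s("hello")))
print(frame)   # [NETWORK][TRANSPORT][APP]hello[/APP][/TRANSPORT][/NETWORK]

# Receiving: the frame ascends the stack; each layer removes only its own
# header, so a layer can be swapped out without touching the others.
print(app_r(tr_r(net_r(frame))))   # hello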
Transmission Control Protocol/Internet Protocol (TCP/IP). TCP/IP comprises two interrelated protocols that are part of the Internet protocol suite. TCP operates on the OSI Transport Layer; it breaks data into packets and controls host-to-host transmissions over packet-switched communication networks (Table 2). The Internet Protocol (IP) was designed for use in interconnected systems of packet-switched computer communication networks. IP operates on the OSI Network Layer and routes packets. The Internet Protocol provides for transmitting blocks of data called datagrams from sources to destinations, where sources and destinations are hosts identified by fixed-length addresses.
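As a concrete illustration of these protocols in use, the sketch below opens a TCP connection (Transport Layer) over IP and then layers the TLS protocol from Table 2 on top of it to obtain an encrypted, authenticated channel. Only the Python standard library is used, and the host name is an arbitrary example.

# Sketch: a TCP connection secured with TLS, using only the Python standard
# library. TCP provides host-to-host transport over IP; TLS (see Table 2)
# adds encryption and server authentication on top of it.
import socket
import ssl

HOST = "example.com"   # arbitrary example host

context = ssl.create_default_context()       # verifies the server certificate

with socket.create_connection((HOST, 443)) as tcp_sock:          # TCP over IP
    with context.wrap_socket(tcp_sock, server_hostname=HOST) as tls_sock:
        print(tls_sock.version())            # negotiated protocol, e.g. TLSv1.3
        tls_sock.sendall(b"HEAD / HTTP/1.0\r\nHost: example.com\r\n\r\n")
        print(tls_sock.recv(200))            # first bytes of the reply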
2.4.3. Wi-Fi (Wireless Fidelity) technology

Local area wireless networking, generally called Wi-Fi (also known as 802.11b Ethernet), is a hot topic. Companies, universities and home users are setting up wireless access points and running notebook computers without network wires. Wi-Fi, or Wireless Fidelity, allows users to connect to the Internet from their home, from a hotel room or from a conference room at work without wires. Wi-Fi enabled computers send and receive data anywhere within the range of a base station, at a speed that is several times faster than the fastest cable modem connection. Wi-Fi connects the user to others and to the Internet without the restriction of wires, cables or fixed connections. Wi-Fi gives the user the freedom to change locations (mobility) and to have full access to files, office and network connections wherever she/he is. In addition, Wi-Fi easily extends an established wired network [21]. Wi-Fi networks use radio technologies called the IEEE 802.11b or 802.11a standards to provide secure, reliable and fast wireless connectivity. A Wi-Fi network can be used to connect computers to each other, to the Internet, and to wired networks (which use IEEE 802.3, or Ethernet). Wi-Fi networks operate in the 2.4 GHz (802.11b) and 5 GHz (802.11a) radio bands, with an 11 Mbps (802.11b) or 54 Mbps (802.11a) data rate, or with products that contain both bands (dual band), so they can provide real-world performance similar to the basic 10BaseT wired Ethernet networks used in many offices. 802.11b has a range of approximately 100 meters. Products based on the 802.11a standard were first introduced in late 2001. Its strengths are its high speed and its lower risk of radio frequency interference than either 802.11b or 802.11g. Its weakness is that "a" is incompatible with the more popular "b" and the emerging "g", because it strayed from the 2.4 GHz band. As WLAN is spreading, it could prove essential for serving large populations in concentrated areas, such as downtowns, universities and business centers. 802.11g promises complete interoperability with "b" and transmission rates up to five times faster in the same 2.4 GHz band. Early products are already on the market. The higher vulnerability to radio frequency interference from other 2.4 GHz devices (late-generation cordless phones) is a big challenge for 802.11g [22]. Wi-Fi networks can work well both for homes (connecting a family's computers together to share such hardware and software resources as printers and the Internet) and for small businesses (providing connectivity between mobile salespeople, floor staff and "behind-the-scenes" departments). Because small businesses are dynamic, the built-in flexibility of a Wi-Fi network makes it easy and affordable for them to change and grow. Large companies and universities use enterprise-level Wi-Fi technology to extend standard wired Ethernet networks to public areas like meeting rooms, training classrooms and large auditoriums, and also to connect buildings. Many corporations also provide wireless networks to their off-site and telecommuting workers to use at home or in remote offices. It is easy to extend an existing network with a Wi-Fi LAN and to add another wireless computer to it: there is no need to purchase or lay more cable or to find an available Ethernet port on the hub or router; the card just has to be plugged into the computer and it is connected to the net.

2.4.4. Mobile technology
Mobile communication is connected with the use of mobile phones. The mobile phone was the device that offered a great number of people the possibility to make contact with others from anywhere, at any time and for anybody.
The mobile phone is the device that realizes mobility at the level of society, as in many countries more than 70% of the population has a mobile phone. There are different mobile systems/network protocols, which are developing very fast.
• CDMA (Code Division Multiple Access, 2G): CDMA networks incorporate spread-spectrum technology to gracefully allocate data over available cells.
• CDPD (Cellular Digital Packet Data, 2G): CDPD is a protocol built exclusively for sending wireless data over cellular networks. CDPD is built on TCP/IP standards.
• GSM (Global System for Mobile Communications, 2G): GSM networks are mainly popular in Europe.
• GPRS (General Packet Radio Service, 2.5G): GPRS technology offers significant speed improvements over existing 2G technology.
• iMode (from DoCoMo, 2.5G): iMode was developed by DoCoMo and is the standard wireless data service for Japan. iMode is known for its custom markup language enabling multimedia applications to run on phones.
• 3G: 3G networks promise speeds rivaling wired connections. Both in Europe and North America, carriers have aggressively bid for 3G spectrum, but no standard has yet emerged.
The introduction of WAP (Wireless Application Protocol) was a big step forward for mobile communication, as this protocol made it possible to connect mobile devices to the Internet. By enabling WAP applications, a full range of wireless devices, including mobile phones, smart-phones, PDAs and handheld PCs, gain a common method for accessing Internet information. The spread of WAP became even more intensive as the mobile phone industry actively supported WAP by installing it in new devices. As WAP was designed to operate on top of any type of wireless data network, WAP enables rapid application deployment and provides access to the broadest consumer base. Whether network operators are deploying CDMA, CDPD, GPRS, GSM, iDEN, PDC or TDMA data solutions, application providers can reach subscribers across multiple operator networks with a single application. WAP applications exist today to view a variety of Web content, manage e-mail from the handset and gain better access to network operators' enhanced services. Beyond these information services, content providers have developed different mobile solutions, e.g., mobile e-commerce (mCommerce). Mobile technology affects the operation of enterprises as well. The main reasons to develop a mobile solution in the enterprise are listed in the following:
• Provide access to company e-mail,
• Provide access to Intranet applications,
• Develop specific company applications,
• Permanent contact with service workers,
• Improved work scheduling,
• The possibility of mCommerce.
Mobile communication extends company data, back-end information systems and e-mail to mobile employees, broadening the accessibility of mission-critical data. Mobile access modifies the way workers interact with colleagues, customers and suppliers.

2.4.5. Powerline communications
As cable, telephone and wireless companies compete to provide high-speed Internet access to homes, a new challenger is emerging based on a decidedly old technology. The idea is to connect the Internet and network computers in a LAN by using the world's largest existing network: the power grid. Powerline Communications (PLC), communications over the electricity distribution grid, has become a hot topic recently. Although this technology has been in use for special applications for several decades (e.g., street lighting is frequently operated according to this principle), communication in these cases is exclusively in the narrowband range and transmission rates are correspondingly low. The first attempts to use the power grid as a communication network were not really successful, but the technological advancements of the last few years have overcome the technical issues, most notably that of line noise, or interference from electrical devices plugged into the same electricity grid, which can disrupt data transmission. PLC works by transmitting data signals through the same power cables that transmit electricity, but it uses a different frequency. To do this, every PC needs to be fitted with a PLC adapter, which also functions as a modem [23]. The operation of PLC can be divided into two phases:
- Procedures performed outside the home (outdoor): the conventional telecommunications infrastructure is used to connect the relevant local network station with the telephone network or a specific Internet backbone. Depending on distance and local conditions, the connection is enabled by radio, copper lines or optical cables. The local network station combines data and voice signals on the power grid and sends them as a data stream to any socket in the connected households, i.e., to the end user via the low-voltage network.
- Procedures inside the home (indoor): the access point forwards incoming data streams to the indoor network, and an indoor master in the household controls and coordinates all (externally and internally) transmitted data signals. Intermediate adapters separate data and power at the socket and forward the data to individual applications.
There is no need for separate telephone or data cabling, since the socket, far from being a mere electrical point, becomes a powerful communications interface which bridges the last mile for high-speed Internet access, thus enabling networking throughout the building or household. The powerline technology applied today transmits data at 4.5 Megabits per second (Mbit/s) via the electricity supply grid (in the medium term rates of up to 20 Mbit/s are possible) and provides permanent high-speed access to the Internet (always online) from every mains voltage supply socket in a building, making broadband capacity cost-efficiently available over the "last mile". It is no longer necessary to always dial
into the network, or indeed to install additional cabling within a building, so PLC is also an interesting alternative for an in-house data network. Because PLC uses the existing electrical wiring hidden in the walls of homes and buildings, users can do away with messy cables and do not need to open floorboards, hack walls or break ceilings to run the wires. PLC also enables indoor networking for PCs and printers, plus shared Internet access between PCs in an office or home. In addition, PLC boasts a superior range of 300 m (without using repeaters), compared to 100 m for standard Fast Ethernet and about 100 m for 802.11b wireless connections. For utility suppliers, PLC opens a whole new revenue stream, which they can deploy quickly. For service providers buying wholesale service from utility companies, PLC also offers various benefits, including the speed and cost of deployment and the ability to break the telephone companies' monopoly on last-mile access in many countries. Proof that the PLC concept also works in practice was furnished by a series of field trials in 16 European countries from Portugal to Scandinavia, as well as in Hong Kong and Singapore. These trials fulfilled all expectations regarding reliability, functionality and the practical applications of powerline communications. The first installations are now already up and running or about to go live.
2.4.6. The Grid computing
"G rid" computing is an important new field, that has to be distinguished from conventional distributed computing by its focus on large- scale resource sharing, innovative applications, and high-p erfor mance orientation. " Grid" can be defined as a hardware and software infrastruc ture that provides dependable, consistent, pervasive and inexpensive access to high- end comp utational capabilities resulting flexible, secure, coordinated resource sharing amo ng dynamic collections of individuals, institutions, and resources-to sum up them as virtual organization s (24). T he real and specific problem that und erlies the Grid concept is coo rdinated resource sharing and problem solving in dynamic, multi-inst itut ional virtual organizations. The shari ng is not primarily file exchange but rather direct access to computers, software, data, and other resources, as is required by a range of collaborative problem-solving and resource brokering strategies emerging in indu stry, science, and engineering. This sharing is highly controlled, clearly and carefully defined wh at is shared, who is allowed to share, and the conditions under whi ch sharing occurs. A set of individuals and/or institutions defined by such sharing ru les form is called a virtual organization (VO). Furth ermore, sharing is about more than simply document exchange (as in "virtual enterprises"): it can involve direct access to remote software, computers, data, sensors, and oth er resources. For example, membe rs of a consortium may provide access to specialized software and data and/or poo l their computational resources. The memb ers of a Virtu al Organization do not necessarily have to wor k togeth er on the same site, but the Grid will make it feel, to the memb ers, as if they are on the same network. The Grid architecture is a protocol architecture, with protocols defining the basic mechanisms by which VO users and resources negotiate, establish, manage, and exploit sharing relationships. A standards- based open architecture facilitates extensibility,
interoperability, portability, and code sharing; standard protocols make it easy to define standard services that provide enhanced capabilities. The primary goal of the Grid at the moment is to allow coordinated resource sharing in virtual organizations. Current Internet technologies address communication and information exchange among computers but do not provide integrated approaches to the coordinated use of resources at multiple sites for computation. Business-to-business exchanges focus on information sharing (often via centralized servers), and virtual enterprise technologies do the same. Enterprise distributed computing technologies such as CORBA and Enterprise Java enable resource sharing only within a single organization. The Grid builds on these existing technologies rather than competing with them; it will act as a middleware between the high-level behaviors of the Internet (such as its protocols) and the lower levels (for example, the application layer), complement its functionality, and add flexibility. The Grid can be viewed as an extension of the Web, building on its protocols and offering new functionality [25]. As the Grid is built on the existing Internet, it will share its capabilities, such as simple data retrieval and transfer, as well as the basic file-sharing functions provided by peer-to-peer applications. The prospects for the future, however, are far greater, and could change not only the way information is shared but also the way computers interpret information and even, by integrating developing technologies such as Jini and Bluetooth, how this technology enters daily life. The convergence of the Internet with mobile technologies has been foreseen for several years, with the continual development of Bluetooth, a wireless network interface, and the unfulfilled expectations of WAP technology. With the successful launch of the 802.11b standard for wireless communication, and with the build-up surrounding the eagerly anticipated 3G mobile phone technology, it will soon be possible to obtain mobile high-bandwidth connections to LANs and the Internet. The advantages for the Grid are obvious: it will be possible to gain high-speed access to resources, services and information without the restriction of cables, and with a high quality of service. This freedom could be integrated with Jini technology in order to gain extra flexibility by broadening the range of devices. Of course there are limitations on the technology today; e.g., wireless access to the Internet requires an Access Point (AP) to be within range of the wireless host (about 100 m), and compression and synchronization techniques may not be suitable for sending large quantities of technical data over a wireless link.
3. BUSINESS PROCESSES
3.1. The content of business processes
Business process (BP) can be defined as "a set of logically related tasks performed to achieve a defined business outcome" [26]. A process can be described as a structured, measured set of activities/tasks designed to produce a specified output for a particular customer or market. It implies a strong emphasis on how work is done within an organization. A technique for identifying business processes in an organization is, e.g., the value chain method.
Processes are usually identified based on their starting and end points, interfaces, and organization units involved. Examples of processes include developing a new product, ordering goods from a supplier, etc. Processes may be defined based on three dimensions [26]:
• Entities: Processes take place between organizational entities. They could be interorganizational, interfunctional or interpersonal.
• Objects: Processes result in the manipulation of objects. These objects could be physical or informational.
• Activities: Processes could involve two types of activities: managerial (e.g., develop a plan) and operational (e.g., fill a customer order).
3.2. Relation between BPR & information and communication technology
There is a close connection between information and communication technology (ICT) and business processes. Business processes represent an approach to coordination across the firm, while ICT promises to be the most powerful tool for reducing the costs of coordination. ICT capabilities should support business processes, and business processes should be designed in terms of the capabilities ICT can provide, so ICT and BP are in a recursive relationship. ICT should be viewed as more than an automating or mechanizing force; it can fundamentally reshape the way business is done, so ICT has a strategic role in the BP life cycle. In case the business environment changes, the business processes have to be redesigned or, in a broader sense, re-engineered. Business Process Re-Engineering (BPR) can be defined as the analysis and radical redesign of workflows and processes within and between organizations to achieve breakthrough improvements in performance measures [26], [27]. Because of the deep and pervasive changes, organizations undertaking BPR must redesign not only their business processes, but also their products, assets, culture, thought patterns, behaviors, and/or technology spanning across functional areas. Davenport & Short [26] describe the following capabilities that reflect the roles that IT can play in BPR: Transactional, Geographical, Automational, Analytical, Informational, Sequential, Knowledge Management, Tracking, and Disintermediation. In the current context of the increasing recognition of ICT as a strategic resource, the leadership of an information system function in an organization could be viewed as a powerful, and perhaps critical, element in affecting the success of BPR. Clearly, the purpose of BPR is the transformation of business processes, and the strategic application of an IC system through its functions can make a powerful impact on a business as it is transformed. In the case of re-engineering business processes, the rapidly developing information and communication technologies have a pulling force by offering revolutionary new possibilities for how business processes can be reorganized. Also, innovative uses of ICT would inevitably lead many firms to develop new, coordination-intensive structures, enabling them to coordinate their activities in ways that were not possible before. Such coordination-intensive structures may raise the
organization's capabilities and responsiveness, leading to potential strategic advantages. As ICT has a strategic role in business process re-engineering, it is very important to handle BP-related information in a trusted way. Information and communication systems have to be equipped with security techniques and tools that can prevent unauthorized persons/organizations from accessing sensitive information and data. In the following sections these technologies will be introduced.
4. SECURITY TECHNOLOGIES
4.1. Types and trends of cyber crimes
The logical approach to introducing security mechanisms is to start with the definition of the threat model of the information system. The threat model is the collection of probable attack types, so it defines the system protection requirements as well. Attacks on information and communication systems are classified into two main groups:
• A passive attack can only observe communications or data.
• An active attack can actively modify communications or data.
In the following, active attacks will be described, but passive attacks precede active attacks in many cases. The "Computer Crime and Security Survey" of the Computer Security Institute (CSI) is based on responses from 530 computer security practitioners in U.S. corporations, government agencies, financial institutions, medical institutions and universities [28]. The survey confirms that the threat from computer crime and other information security breaches continues unabated. The total reported financial loss of 251 responders was $201,797,340 in 2003, while in 2000 this sum was $265,589,940 from 249 responders. These numbers demonstrate that the loss/damage caused by the attacks is decreasing. One reason for this shrinkage can be that companies use security technologies today to a greater extent than they did several years before. The 525 responders use the following security technologies (in %): digital IDs-49, intrusion detection-73, physical security-91, encrypted login-58, firewalls-98, anti-virus SW-99, encrypted files-69, biometrics-11, access control-92. The most frequent types of attacks and the financial loss caused by them are listed in Table 3. (The percentage gives the rate of responders involved in the attack; the losses are in $.) It is worth giving a short description of the most common attack types in order to understand later the needed counter-measures. A detailed description of attack types can be found, e.g., in [35]. Computer viruses are the best-known form of Internet security attack. A virus is a piece of software programmed with the unique ability to reproduce and spread itself to other computers. A virus may be merely annoying, or completely destructive. The most destructive viruses can erase the contents of the computer's hard drive, or make it completely useless. If no back-ups were made, important data can be lost or damaged,
Table 3 Most frequent types of attacks in the US [28]: virus, insider abuse of net access, laptop theft, unauthorized access, denial of service, system penetration, theft of proprietary information, sabotage, financial fraud, telecom fraud, telecom eavesdropping.
that could result in serious financial losses. A victim computer can be infected by a virus or a script either through e-mail, by downloading infected software from the Internet, or by using infected media (floppy disk or CD-ROM). The Trojan horse is a special type of virus in the way it is transmitted; however, unlike a virus, a Trojan horse does not replicate itself. It stays in the target machine, inflicting damage or allowing somebody from a remote site to take control of the computer. A worm is another type of virus that can reproduce itself across all the different nodes or connections. Worms generally cause most of their damage by clogging the network, using up valuable memory and wasting valuable processing time. If an attacker gets control of a computer, he or she can access all the files that are stored on the computer, including all types of sensitive information (personal or company financial information, credit card numbers, and client or customer data or lists). It is obvious that this could do significant damage to any business. If data is altered or stolen, a company can risk losing the trust and credibility of its customers. In addition to the possible financial loss that may occur, the loss of information can cause a loss of competitiveness in the market. Sometimes the biggest problem is that, as the information can also be copied, the original owner will not notice the attack, since no information loss can be detected. But the data will be present in another location (disk) as well, and without the knowledge of the rightful owner the valuable information will be used by the illegal owner. Denial of service attacks are dead intervals of a computer system caused by an attacker who uses one or more computer systems to force another system offline by overloading it with useless traffic. A denial of service attack is a form of traffic jam on the network-an attacker can paralyze, e.g., a business's web server in this way. The cited survey contains a lot of interesting and instructive statistics and some case studies as well, but it is the trend confirmed by the statistics that is most important. The main conclusions of the analysis are as follows:
• Overall financial losses from 530 survey respondents totaled $201,797,340. This is down significantly from 503 respondents reporting $455,848,000 last year. (75 percent of organizations acknowledged financial loss, though only 47% could quantify it.)
• The overall number of significant incidents remained roughly the same as last year, despite the drop in financial losses.
• Losses reported for financial fraud were drastically lower, at $9,171,400. This compares to nearly $116 million reported last year.
• As in prior years, theft of proprietary information caused the greatest financial loss ($70,195,900 was lost, with the average reported loss being approximately $2.7 million).
• In a shift from previous years, the second-most expensive computer crime among survey respondents was denial of service, with a cost of $65,643,300-up 250 percent from last year's losses of $18,370,500.
These conclusions should inspire organizations and information managers to take effective and comprehensive steps to defend their systems and companies.
4.2. Computer system and network security
The data presented in the previous subchapter clearly show the importance of taking care of security from the physical level to the information level. Security has its own cost, but that cost can be calculated, while losses cannot be predicted! Suppose a system runs without security for a year; after one year, compare the value of the system with its value one year before. The difference is the frame for a security program's budget. Security is conscious risk-taking, so in every phase of a computer system's life cycle that security level must be applied which costs less than the expense of a successful attack. In other words, security must be so strong that it would not be worth attacking the system, because the investment in an attack would be higher than the expected benefits. At different levels different security solutions have to be applied, and these separate parts have to cover the entire system consistently. In Table 4 the main practical fields of ICT security are summarized in order to better understand the content of the following chapters. In the field of security, standards and quasi-standards have an important role. In the following, some of the most relevant ones are introduced shortly, only to show the directions and status of these significant works. In order to classify the reliability and security level of computer systems, an evaluation system has been developed and the criteria have been summarized in the so-called "Orange Book" [29]. Its purpose is to provide technical hardware/firmware/software security criteria and associated technical evaluation methodologies in support of the overall ADP system security policy, evaluation and approval/accreditation responsibilities promulgated by DoD Directive 5200.28. The ISO/IEC 10181 [30] multi-part (1-8) "International Standard on Security Frameworks for Open Systems" addresses the application of security services in an "Open Systems" environment, where the term "Open System" is taken to include areas such as database, distributed applications, open distributed processing and OSI. The Security Frameworks are concerned with defining the means of providing protection for systems and objects within systems, and with the interactions between systems.
Table 4 Main fields of ICT security

Human & SW security
- Organization security: definition of security policy (e.g., access rights).
- Personal security: trained and reliable staff needed.
- Network (channel) security: using reliable network tools, and frequently checked channels and well-configured network elements.
- Computer (end point) security: using tested application SW tools, a frequently checked operating system, and properly configured HW systems.

Physical security
- Organization security: placing the computers in secure locations of the building and offices.
- Network (channel) security: preventing direct or close access to network cables, or application of special technologies.
- Computer (end point) security: preventing direct physical access to computers by unauthorized persons, or close access in an electromagnetic way.
The Security Frameworks are not concerned with the methodology for constructing systems or mechanisms. The Security Frameworks address both data elements and sequences of operations (but not protocol elements), which may be used to obtain specific security services. These security services may apply to the communicating entities of systems as well as to data exchanged between systems, and to data managed by systems. The ISO/IEC 15408 standard [31] consists of three parts, under the general title "Evaluation Criteria for Information Technology Security" (Part 1: Introduction and general model, Part 2: Security functional requirements, Part 3: Security assurance requirements). This multipart standard defines criteria to be used as the basis for evaluation of the security properties of IT products and systems. This standard originates from the well-known work called the "Common Criteria" (CC). By establishing such a common criteria base, the results of an IT security evaluation become meaningful to a wider audience. By now, "Protection Profiles" based on CC guidelines are available for computer systems and for smart cards as well. The standard permits comparability between the results of independent security evaluations. It does so by providing a common set of requirements for the security functions of IT products and systems and for assurance measures applied to them during a security evaluation. The evaluation process establishes a level of confidence that the security functions of such products and systems and the assurance measures applied to them meet these requirements. The evaluation results may help consumers to determine whether the IT product or system is secure enough for their intended application and whether the security risks implicit in its use are tolerable. The standard is useful as a guide for the development of products or systems with IT security functions and for the procurement of commercial products and systems with such functions.
4.3. Role of trust
Developing the proper security policy and selecting the proper equipment, tools and the best-fitting methodology and algorithms require high-level expertise, since for such a multidimensional, interdisciplinary decision problem there is often no optimal, only a suboptimal, solution. The problem space is extremely complex, as the whole economy is based on networked information management, all sectors are strongly influenced by ICT, in the Information Society the behavior and habits of people are dynamically changing, and government-supported programs can speed up certain processes. In all information and communication systems there is a common factor: the human being. This factor plays the most important role at every level and in every aspect. A human can be a designer, a developer, or a user (sometimes a hostile user-a cracker) of the system. The most frequent instantiation of the human being is the average user, who may not be well informed/skilled in computer science but has his or her own personality and psyche. In order to persuade individuals to use a certain information system, they have to be convinced that it is safe to use the system, and that their data will not be modified, lost, or used in ways other than previously defined, etc. Once the individuals have been convinced, they will trust the system and they will use it. In the following paragraphs the meaning and content of trust will be introduced, and the possibilities (technologies, methods, policies, etc.) of gaining this trust will be shown as well. The word "trust" is used by different disciplines, so there are many definitions of the term, each fulfilling the demands of the actual theory or application. In everyday life, without trust one would be confronted with the extreme complexity of the world at every minute. No human being could stand this, so people have to have fixed points around them: one has to trust family members and partners, trust the institutions of a society and its members, and trust within and between organizations and partners. Trust can be defined as a psychological condition comprising the trustor's intention to accept vulnerability based upon positive expectations of the trustee's intentions or behavior [32]. Those positive expectations are based upon the trustor's cognitive and affective evaluations of the trustee and the system/world, as well as on the disposition of the trustor to trust. Trust is a psychological condition (interpreted in terms of expectation, attitude, willingness, perceived probability). Trust can cause or result from trusting behavior (e.g., co-operation, taking a risk) but is not behavior itself. The following components are included in most definitions of trust:
- willingness to be vulnerable/to rely,
- confident, positive expectation/positive attitude towards others,
- risk and interdependence as necessary conditions.
Trust has different forms, such as:
1. Intrapersonal trust-trust in one's own abilities; self-confidence; basic trust (in others).
2. Interpersonal trust-expectation based on cognitive and affective evaluation of the partners; in primary relationships (e.g., family) and non-primary relationships (e.g., business partners).
3. System trust-trust in depersonalized systems/world that function independently (e.g., economic system, regulations, legal system, technology); requires voluntary abandonment of control and knowledge [33].
4. Object trust-trust in non-social objects; trust in their correct functioning (e.g., in an electronic device).
4.4. Security services and mechanisms
The following services together form the sense of "trust" for a human being who uses a service or a given piece of equipment [34]:
• Privacy ensures that only the sender and the intended recipient of an encrypted message can read the contents of that message. To guarantee privacy, a security solution must ensure that no one can see, access or use private information, such as addresses, credit card information and phone numbers, as it is transmitted over the Internet.
• Integrity ensures the detection of any change in the content of a message between the time it is sent and the time it is received. In many systems, if an alteration is detected, the receiving system requests that the message be resent.
• Authentication ensures that all parties in a communication are who they claim to be. Server authentication provides a way for users to verify that they are really communicating with the Web site they believe they are connected to. Client authentication ensures that the user is who they claim to be.
• Non-repudiation provides a method to guarantee that a party to a transaction cannot falsely claim that they did not participate in that transaction. In the real world, handwritten signatures are used to ensure this.
The means for achieving these services depend on the collection of security mechanisms that supply security services, the correct implementation of these mechanisms, and how these mechanisms are used. Three basic building blocks of security mechanisms are used:
• Encryption is used to provide confidentiality, and can also provide authentication and integrity protection.
• Digital signatures are used to provide authentication, integrity protection, and non-repudiation.
• Checksums/hash algorithms are used to provide integrity protection and can provide authentication.
One or more security mechanisms are combined to provide a security service, and a typical security protocol provides one or more services. As there are too many security technologies, tools and pieces of equipment to introduce here, only the most frequently used, and some new ones, will be described shortly in the following. Detailed descriptions can be found, e.g., in [34], [35], and [36].
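To make these building blocks concrete, here is a short Python sketch (an illustration added for this discussion, not taken from the cited references; the key and message values are invented) in which a hash algorithm is combined with a shared secret key to provide integrity protection and message authentication:

import hashlib
import hmac

SECRET_KEY = b"shared-secret-key"  # hypothetical key known to sender and receiver

def protect(message: bytes) -> bytes:
    # The sender computes a keyed fingerprint (MAC) of the message.
    return hmac.new(SECRET_KEY, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes) -> bool:
    # The receiver recomputes the MAC; any alteration of the message
    # (integrity) or use of a wrong key (authentication) makes it mismatch.
    expected = hmac.new(SECRET_KEY, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

message = b"transfer 100 USD to account 42"
tag = protect(message)
print(verify(message, tag))                            # True
print(verify(b"transfer 900 USD to account 66", tag))  # False: change detected

A complete protocol such as SSL layers encryption and key exchange on top of this kind of keyed hashing, as discussed later in this chapter.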
4.5. Tools, methods and techniques for security
4.5.1. Achieving confidentiality
The main factor of trust is confidentiality, which can be achieved by technologies that convert/hide data or text into a form that cannot be interpreted by unauthorized persons. There are two major techniques to fulfil this goal: encryption and steganography.
• Encryption is transforming the message into a ciphertext such that an enemy who monitors the ciphertext cannot determine the message sent. The legitimate receiver possesses a secret decryption key that allows him to reverse the encryption transformation and retrieve the message. The sender may have used the same key to encrypt the message (with symmetric encryption schemes) or a different, but related, key (with public key schemes). Public key infrastructure (PKI) technology is widely used; DES and RSA are well-known examples of encryption schemes, while AES (with the Rijndael algorithm) belongs to the new generation.
• Steganography is the art of hiding a secret message within a larger one in such a way that the opponent cannot discern the presence or contents of the hidden message. For example, a message might be hidden within a picture by changing the low-order pixel bits to be the message bits.
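The low-order-bit idea can be sketched in a few lines of Python (a toy illustration in which the "image" is simply a byte array; real steganography tools operate on actual image formats):

def hide(pixels: bytearray, message: bytes) -> bytearray:
    # Store each bit of the message in the least significant bit of one "pixel".
    out = bytearray(pixels)
    bits = ((byte >> k) & 1 for byte in message for k in range(8))
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return out

def reveal(pixels: bytearray, length: int) -> bytes:
    # Reassemble the message from the low-order bits of the first pixels.
    bits = [p & 1 for p in pixels[:length * 8]]
    return bytes(
        sum(bits[i * 8 + k] << k for k in range(8)) for i in range(length)
    )

cover = bytearray(range(256))  # a toy "image" of 256 pixel bytes
stego = hide(cover, b"secret")
print(reveal(stego, 6))        # b'secret'

Because only the least significant bit of each pixel changes, the cover image looks essentially unchanged to a casual observer, which is exactly the point of steganography.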
The goal for security in distributed environments is to reflect, in a computing- and communication-based working environment, the general principles that have been established in society for policy-based resource access control. Each involved entity/node should be able to make their assertions without reference to a mediator, and especially without reference to a centralized mediator (e.g., a system administrator) who must act on their behalf. Only in this way will computer-based security systems achieve the decentralization needed for scalability in large distributed environments. The security architectures represent a structured set of security functions (and the needed hardware and software methods, technologies, tools, etc.) that can serve the security goals of the distributed system. In addition to the security and distributed enterprise functionality, the issue of security is as much (or more) a deployment and user-ergonomics issue as a technology issue. That is, the problem is largely one of finding out how to integrate good security into the industrial environment so that it will be used, trusted to provide the protection that it offers, easily administered, and really useful.
4.5.3. Firewalls
Firewalls can make the user's network appear invisible to the Internet, and they can block unauthorized and unwanted users from accessing files and systems. Hardware and software firewall systems monitor and control the flow of data in and out of computers in wired and wireless enterprise, business and home networks. They can be
set to intercept, analyze and stop a wide range of Internet intruders and hackers. Like VPNs, firewall technology comes in many types and levels. Many firewall solutions are software only; many are powerful hardware and software combinations.
4.5.4. Virus defense
Viruses and other malicious code (worms and Trojans) can be extremely destructive to vital information and computing systems, both for individuals and for businesses. There have been big advances in anti-virus technology, but malicious code remains a permanent threat. The reason is that even the highest-level security technology can only be as effective as the users who operate it. In the chain of computer security, human beings seem to be the weakest point, so there is no absolute security in virus defense. There are some basic rules that have to be followed, through which users can achieve an acceptable level of virus protection:
• Do not let anybody else use your computer.
• Install an anti-virus program and update it regularly.
• Use different anti-virus technologies.
• Open e-mail attachments only from trusted sources.
• Be wary of new software, even from a trusted source.
• Check CDs and floppy disks before using them.
• Back up files regularly.
• In case the computer has been infected by a virus, contact professionals (network/system administrator, or a specialized firm).
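A classic building block of anti-virus programs is matching files against a database of fingerprints of known malicious code. The Python sketch below (illustrative only; the digest is a placeholder, not a real malware signature) shows the principle:

import hashlib

# Hypothetical signature database mapping SHA-256 digests to malware names.
KNOWN_BAD = {
    "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08": "Example.Trojan.A",
}

def scan_file(path: str):
    # Hash the whole file and look the digest up in the signature database.
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return KNOWN_BAD.get(digest)  # malware name, or None if no signature matches

Real scanners complement such signature matching with heuristic and behavioral analysis, since changing a single byte of a malicious file changes its digest.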
4.5.5. Identification of persons
Biometrics refers to a science involving the statistical analysis of biological observations, phenomena and characteristics. Lately, the term "biometrics" commonly refers to technologies that analyze human characteristics for security purposes. A widely accepted definition of security-based biometrics is as follows: "A biometric is a unique, measurable characteristic or trait of a human being for automatically recognizing or verifying identity." Biometric technologies, therefore, are concerned with the unique physical parts of the human body or the personal behavioral characteristics of human beings. The term "automatic" essentially means that a biometric technology must recognize or verify a human characteristic quickly and automatically, in real time. Physiological traits (eye (iris and retina), face, finger image, hand) are stable physical characteristics and are essentially unalterable. Behavioral characteristics (signature, voice, or keystroke dynamics) are influenced by both controllable actions and less controllable psychological factors. New biometric identifiers under development include body odor, DNA, ear shape, and the facial thermogram. As behavioral characteristics can change in the course of time, the enrolled biometric reference template must be updated each time it is used. Although behavior-based biometrics can be less expensive and less threatening to users, physiological traits tend to offer greater accuracy and security. In any case, both techniques provide a significantly
higher level of identification than passwords or smart cards alone. Because biometric characteristics are unique to each individual, they can be used to prevent theft or fraud. Unlike a password or personal identification number (PIN), a biometric trait cannot be forgotten, lost, or stolen. According to security experts, biometrics is considered to provide the highest level of security. Biometry can be used in IC systems instead of passwords, since with biometry the person is identified, not the device.
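Unlike a password check, biometric verification is approximate: a freshly captured sample is accepted if it is close enough to the enrolled reference template. The toy Python sketch below (with invented 64-bit templates and threshold; real matchers are far more elaborate) illustrates this matching-by-distance principle:

ENROLLED = 0x5A5AF0F01234ABCD  # hypothetical stored 64-bit reference template
THRESHOLD = 10                 # maximum number of differing bits still accepted

def hamming_distance(a: int, b: int) -> int:
    # Count the bits in which two binary templates differ.
    return bin(a ^ b).count("1")

def verify(sample: int) -> bool:
    # Accept the sample if it is sufficiently similar to the enrolled template.
    return hamming_distance(sample, ENROLLED) <= THRESHOLD

print(verify(0x5A5AF0F01234ABCC))  # True: only one bit differs
print(verify(0x0000000000000000))  # False: far too many bits differ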
4.5.6. Smart cards
There is a strong need for a tool that can fulfil the functions connected to trustworthy services. Smart card (SC) technology can offer a solution for the current problems of secure communication by simultaneously fulfilling the main demands of identification, security and authenticity besides the functions of the actual application. The smart card is a plastic card that contains a microprocessor chip, similar to a computer. It has its own operating system, memories, file system and interfaces. A smart card can handle all authorized requests coming from the "outside world". It is also called an IC card. There are different SC configurations equipped with different interfaces: the crypto-card has a built-in chip for doing encryption/decryption, other cards have keyboards, and the SC for secure identification has a fingerprint sensor [37], [38]. Smart cards can help in the secure signing of digital documents as well. Smart cards can be read by SC readers integrated into or connected to PCs or any other equipment. Smart cards are also important parts of physical or logical access systems for enterprises. The application of SCs in the security field can bring about the next step of the technological revolution by offering new possibilities for effectively integrating the functions of security and the actual application field. In this way the SC can be the general, and at the same time personalized, "key" of the citizens for the Information Society.
People like smart little devices and tools that they can keep in their hands and carry with themselves permanently, so that they can control them both physically and in time. This physical and temporal controllability makes people think that these devices are secure (physically nobody else can access them), so they trust them (even though this assumption is not always true). In case such a device can be used for communication, it is called a mobile phone. Today mobile phones represent the first generation of the Personal Trusted Device (PTD), as they can be used not only for talking but for various other functions as well. The connection of mobile phones with the Internet (WAP) was a big leap in the direction of making mobile phones into PTDs. The scale of functions became really wide and different mobile technologies have appeared (mTechnologies). The mobile phone will become a trusted device in e-mail or Web communication by using PKI and other crypto-systems. User authentication could be done based on biometry (fingerprint or voice). The costs of accessed services could be paid with digital money, and the m-purse could be reloaded using OTA (over the air) protocols. Moreover, the application management in such devices could be done dynamically,
and every user could create his/her own profile and environment. The application possibilities of a PTD are nearly infinite; only imagination limits them. Intensive research is being done in this field, and these possibilities could become reality very soon.
4.6. Application of security technologies in networks
There are four different concerns that every security system has to address: privacy (confidentiality), integrity, authenticity and non-repudiation. This is the goal in the case of the different networks as well, independently of what type of media they use for data transmission.
4.6.1. Wired network security
At the beginning of networking the main need was reliable operation, but secure and authentic communication has become a key factor today. According to Internet users, security and privacy are the most important functions to be ensured, and by increasing security the number of Internet users could double or triple, according to different surveys. The main reason for the increased demand is the spread of electronic commerce over the Internet, where money transactions are made at a rate of millions of dollars a day. It is not just a question of our letters' content or our user accounts; it is a question of money. Making false transactions in the real world is not as easy as making them in the insecure virtual world, where the speed and effect of false transactions are dangerous not only for individuals but also for governments. There are several solutions for securing the network, but security is in inverse proportion to usability, and most of the security tools are patches, extra solutions and rather stand-alone techniques. There are alternatives for using secure connections; some examples from everyday applications are listed in Table 2. The FTP (File Transfer Protocol) application is used to provide file transfer across a wide variety of systems. Usually implemented as application-level programs, FTP uses the Telnet and TCP protocols. The server side requires a client to supply a login identifier and password before it will honor requests. The information travels in plain text, and with a packet dump it is possible to sniff the communication; therefore it is advisable to use SSH-based SCP (secure copy) for file transfer. SSH (Secure Shell) is a secure access method for a remote server, used instead of telnet (it includes a secure copy service instead of FTP, and transfers X sessions securely too). Instead of HTTP there is HTTPS (secure HTTP), which is HTTP over SSL (Secure Sockets Layer). Instead of simple e-mail there is PGP (Pretty Good Privacy) signed e-mail. With these techniques it can be guaranteed that the information in an e-mail, a file or a Web page will be reached only by authorized parties. As SSL (Secure Sockets Layer, a security protocol for TCP/IP) is the most important protocol, it will be discussed in more detail in the following. Over the Internet, the Secure Sockets Layer (SSL) protocol, digital certificates and either user name/password pairs or digital signatures are used together to provide all four types of security.
Public key cryptography is an encryption method that is a key component of SSL. It uses pairs of keys and mathematical algorithms to convert clear text into encrypted data and back again. The pair consists of a registered public key and a private key that is kept secret by its owner. A message encrypted with the public key can be decrypted only by someone with the private key. Likewise, a message encrypted with the private key can be decrypted only by someone with the public key. SSL uses public key cryptography to exchange a shared secret key at the beginning of a secure Internet conversation, thus ensuring that it remains a secret for the duration of the conversation. SSL uses public key cryptography, bulk encryption algorithms and shared secret key exchange techniques to provide privacy over the Internet. To provide integrity, SSL uses hashing algorithms that create a small mathematical fingerprint of a message. If any part of the message is altered, it will not match its fingerprint when the message is checked at the receiving end. In this case, the sender is asked to resend the message. Because anyone can generate key pairs, it would be possible for a malicious party to put up an impostor Web site and then falsify information in a transaction by providing a public key to a user. To prevent this kind of fraud, digital certificates are used to provide an authenticated way to distribute public keys. Digital certificates are also used to authenticate the parties of an Internet conversation, so that users and content providers can both be confident that they know whom they are communicating with. The remaining issue to address is non-repudiation. As with client authentication, most Web applications today simply rely on the entry of a user name and password to provide non-repudiation. Applications can request a digital signature from a client, which requires that the user specifically authorize a transaction. The authorization is then encrypted using the user's private key from their client certificate. Not surprisingly, a digital signature is analogous to a real signature on a check and serves the same purpose. So far, though, the adoption of client certificates for use by individuals on the Internet has been slow. Different combinations of all of these security techniques are used for different applications, depending on which forms of security are important and the degree to which the solution needs to be balanced with convenience for the user. For example, certificate-based client authentication and non-repudiation are not widely used on the Web today because most users don't want to be bothered with the administrative tasks of obtaining and safely maintaining a client certificate.
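The handshake described above can be made tangible with a minimal Python sketch (illustrative; the host name is an arbitrary example): the standard ssl module verifies the server's certificate chain and host name during the handshake, before any application data flows:

import socket
import ssl

HOST = "example.com"  # hypothetical server used for illustration

# Load the platform's trusted CA certificates and enable host-name checking.
context = ssl.create_default_context()

with socket.create_connection((HOST, 443)) as raw_sock:
    # The handshake performs the certificate verification and key exchange
    # described above; it raises ssl.SSLError if verification fails.
    with context.wrap_socket(raw_sock, server_hostname=HOST) as tls_sock:
        print(tls_sock.version())                 # e.g., 'TLSv1.3'
        print(tls_sock.getpeercert()["subject"])  # the server certificate's subject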
4.6.2. Security technologies for wireless communication
A user of a wireless network can apply a variety of simple security procedures to protect the Wi-Fi connection. These include enabling 64-bit or 128-bit Wi-Fi encryption (Wired Equivalent Privacy, WEP), changing the password or network name, and closing the network. These basic techniques work in both small offices and large corporations. However, additional, more sophisticated technologies and techniques can also be employed to further secure the business network. WEP and other wireless encryption methods operate strictly between the Wi-Fi computer and the Wi-Fi access point or gateway. When data reaches the access point or gateway, it is unencrypted and unprotected while it is being transmitted out on the
public Internet to its destination-unless it is also encrypted at the source with SSL or when using a VPN (Virtual Private Network). WEP protects the user from most external intruders, but to achieve a more secure connection additional technologies have to be applied, as WEP also has known security holes. There are several technologies available, but currently the VPN works best.
- VPN (Virtual Private Network)
Today most companies use a VPN to protect their remote-access workers and their connections. It works by creating a secure virtual "tunnel" from the end user's computer, through the end user's access point or gateway and the Internet, all the way to the corporation's servers and systems. It also works for wireless networks and can effectively protect transmissions from Wi-Fi equipped computers to corporate servers and systems. A VPN works through the VPN server at the company headquarters, creating an encryption scheme for data transferred to computers outside the corporate offices. The special VPN software on the remote computer or laptop uses the same encryption scheme, enabling the data to be safely transferred back and forth with no chance of interception. However, VPN access, which enables access to the corporate network, corporate e-mail and communications systems, is provided only to those who have been given authorization.
- Other security technologies that can be applied for Wi-Fi [39]
Kerberos-Another way to protect wireless data is by using a technology called Kerberos. Created at MIT, Kerberos is a network authentication system based on key distribution. It allows entities communicating over a wired or wireless network to prove their identity to each other while preventing eavesdropping or replay attacks. It also provides for data stream integrity (detection of modification) and secrecy (preventing unauthorized reading) using cryptographic systems such as DES.
Media Access Control (MAC) Filtering-As part of the 802.11b standard, every Wi-Fi radio has a unique Media Access Control (MAC) number allocated by the manufacturer. To increase wireless network security, it is possible for an IT manager to program a corporate Wi-Fi access point to accept only certain MAC addresses and filter out all others.
RADIUS (Remote Authentication Dial-In User Service) Authentication and Authorization-This is another standard technology that is already in use by many companies to protect access to wireless networks. RADIUS is a user name and password scheme that enables only approved users to access the network; it does not affect or encrypt data.
Because of the extraordinary success and adoption of Wi-Fi networks, many other security technologies have been developed and are under development. Security is a constant challenge, and there are thousands of companies developing different solutions. There are a variety of security solutions that are effectively put on "top" of the standard Wi-Fi transmission and provide encryption, firewall and authentication services. Many Wi-Fi manufacturers have also developed proprietary encryption technologies that greatly enhance basic Wi-Fi security.
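Conceptually, MAC filtering is just an allowlist lookup performed by the access point. The Python sketch below (a toy with made-up addresses, not access-point firmware) shows the core check:

# Hypothetical allowlist of corporate Wi-Fi radios.
ALLOWED_MACS = {
    "00:1a:2b:3c:4d:5e",  # laptop
    "00:1a:2b:3c:4d:5f",  # printer
}

def admit(mac_address: str) -> bool:
    # Normalize the address, then accept only radios on the allowlist.
    return mac_address.strip().lower() in ALLOWED_MACS

print(admit("00:1A:2B:3C:4D:5E"))  # True
print(admit("66:77:88:99:aa:bb"))  # False: filtered out

Since MAC addresses can be forged, filtering is best treated as one layer among several rather than as a complete defense.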
An important problem is Wi-Fi security in public spaces. Wireless networks in public areas and "HotSpots", like Internet cafes, may not provide any security. Although some service providers do provide this with their custom software, many HotSpots leave all security turned off to make it easier to access and get on the network in the first place. If security is important for the user, the best way to achieve it when connecting back to the office is to use a VPN. In case the user does not have access to a VPN and security is important, it is better to limit the use of the wireless network in these areas to non-critical e-mail and basic Internet surfing. Individuals and companies that need to go beyond basic security mechanisms can choose to implement and combine these basic technologies to increase protection for their mobile workers and their data. As with any network, wired or wireless, the more layers of security that are added, the more secure the transmissions can be.
4.6.3. Mobile security
Mobile security is inherently different from LAN-based security. The basic demands for privacy (confidentiality), integrity, authenticity and non-repudiation are even harder to meet, as the range of users is broader than in traditional networks. As security in the mobile world is more complex and different, it needs more advanced network security models. It can be stated that mobile communication is one of the biggest changes in the security market. Mobile security measures depend on the types of data and applications being mobilized. The more sensitive the data, the more effective the security measures that must be introduced. Enterprises must be aware of how traditional security challenges change in relevance in a mobile world. Some special considerations for mobile security include the following:
- Problem of authentication
As companies report very high numbers of mobile devices being stolen or lost, simply authenticating the mobile device is insufficient. A process of "two-factor authentication" has to be introduced. This technology is used to verify both the device and the identity of the end user during a secure transaction (i.e., two-factor authentication confirms that both the device and the user are authorized agents). Two-factor authentication is critical in protecting network integrity from the inevitability of stolen or lost devices.
- Minimize end-user requirements
End users are impatient when using mobile services. They want access to applications and data immediately and will resist time-consuming access tasks. Requiring end users to conduct complex security processes is counterproductive to the purpose of mobile computing, and further exposes the enterprise to security breaches. While a successful mobile application will require some user participation, involvement should be restricted to quick, easy and mandatory tasks. Password-protecting enterprise applications-an alternative to power-on password authentication-requires users to enter a password or pen-based signature when accessing company content. This is a critical first step in mobile security procedures.
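A common realization of the second factor is a one-time password computed from a secret provisioned on the device at enrollment. The Python sketch below implements the standard HOTP construction of RFC 4226 as an illustration (the secret value is invented):

import hashlib
import hmac
import struct

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    # Keyed hash of the moving counter (RFC 4226).
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: take four bytes at an offset given by the last nibble.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

device_secret = b"provisioned-at-enrollment"  # hypothetical per-device secret
print(hotp(device_secret, counter=1))  # the one-time code the user enters

The server keeps the same secret and counter, so a valid code proves possession of the device, while the user's password or PIN supplies the second, independent factor.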
It is critical that a mobile application supports industry-standard security protocols, including:
HTTPS-Hypertext Transfer Protocol run over the Secure Sockets Layer (SSL).
WTLS-The standard for Wireless Transport Layer Security. This protocol provides authentication and encryption for WAP devices.
WPKI-WAP PKI (used by VeriSign) to maintain security. PKI, or Public Key Infrastructure, is a protocol enabling digital certificates on wired devices. WPKI is an adaptation of PKI for mobile devices that meets m-commerce security requirements. PKI provides the infrastructure and procedures required to enable the trusted partnerships needed to authenticate servers and clients in wireless application environments.
Any type of standard encryption technology-e.g., RSA, Triple DES.
- Implement WPKI authentication technology
PKI, or Public Key Infrastructure, is a protocol enabling digital certificates on wired devices. WPKI is an adaptation of PKI for mobile devices that meets m-commerce security requirements. Because PKI functions are bandwidth intensive and require processors tuned expressly for PKI operations, using a PKI proxy server allows processing to be balanced between the mobile device, the mobile application server, and the proxy server.
- WTLS
WAP Version 1.1 includes the Wireless Transport Layer Security (WTLS) specification, which defines how Internet security is extended to the mobile Internet. WTLS is poised to do for the wireless Internet what SSL did for the Internet: open whole new markets to m-commerce opportunities. There are three steps in the WAP security model:
- The WAP gateway simply uses SSL to communicate securely with a Web server, ensuring privacy, integrity and server authenticity.
- The WAP gateway takes SSL-encrypted messages from the Web and translates them for transmission over wireless networks using WAP's WTLS security protocol.
- Messages from the mobile device to the Web server are likewise converted from WTLS to SSL.
In essence, the WAP gateway is a bridge between the WTLS and SSL security protocols. The need for translation between SSL and WTLS arises from the very nature of wireless communications: low-bandwidth transmissions with high latency. Because SSL was designed for desktop and wired environments with robust processing capabilities connected to a relatively high-bandwidth and low-latency Internet connection, mobile phone users would be disappointed by the delays required to process SSL transactions.
Furthermore, to put SSL functionality into handsets would raise mobile phone costs and destroy the low-cost pricing paradigm that is driving industry growth. WTLS was specifically designed to conduct secure transactions without requiring desktop levels of processing power and memory in the mobile device. WTLS processes security algorithms faster by minimizing protocol overhead and enables more data compression than traditional SSL solutions. As a result, WTLS can perform security well within the constraints of a wireless network. These optimizations mean that smaller, portable consumer devices can now communicate securely over the Internet. The translation between SSL and WTLS takes milliseconds and occurs in the memory of the WAP gateway, allowing for a virtual, secure connection between the two protocols. WTLS and the WAP security model provide an extremely secure solution that leverages the best technologies from the Internet and mobile worlds. When the WAP gateway is deployed in an operator environment according to standard operator security procedures, subscribers and content providers can be assured that their personal data and applications are secure.
4.6.4. Security issues in PLC
From a cybersecurity perspective, the electric power grids are now more fragile, and margins for error are significantly smaller. With diminishing margins and power reserves, the probability of cascading catastrophic effects is higher. There are opinions, based on some theories of how networks work, that hackers could shut down the Internet and the electric power grid if they wanted to. The idea that certain nodes on a network are more important than others is nothing new-but that does not explain how the Internet gets shut down or (even more unlikely) how a "hacker" would shut down a power grid. These theories do suggest some useful things about how certain nodes should be even more carefully protected from such attacks. But the highly decentralized structure of the power plants-generators are not connected to the networks which are hooked to the Internet-means that the damage hackers can cause is limited. Power plants are complex technological organizations, so to shut down a generator one has to open circuit breakers and instruct generators to lower their "set points," the levels at which they are transmitting power. This is not something that can be done solely via a computer network [40]. Security experts say that energy companies are becoming increasingly sophisticated with network security, and have software systems in place allowing them to monitor any suspicious activity. That is important, because while the networks controlling power grids are currently offline, the utilities will come to rely more and more on the Internet. Companies have recently launched Web-based services for their customers, which will eventually offer services including online bill payment. This is where the companies are vulnerable: a hacker could break into the network and "modify" the billing system. There are also potential security issues because a single power line from the utility company goes to multiple homes and office buildings. This means that hackers can "listen in" on the shared bandwidth. But according to a company, a service provider
that rolled out commercial PLC services in Europe, security is not an issue. Its website says that PLC is harder to tap than GSM mobile phones.
4.6.5. Security in the Grid
It is important to note that the "Grid" can be viewed as an "extension" of the Internet: it is rather a set of additional protocols and services that build on Internet protocols and services to support the creation and use of computation- and data-enriched environments. Any resource that belongs to the Grid also, by definition, belongs to the Internet. As a result of the research and development efforts of the Grid community, protocols, services, and tools have been produced that include, e.g., security solutions supporting certificate management, coordination policies, services supporting secure remote access to computing and data resources, and the co-allocation of multiple resources. With respect to the security aspects of the Connectivity layer of the Grid, it is obvious that the complexity of the security problem makes it important for any solutions to be based on existing standards whenever possible. As with communication, many of the security standards developed within the context of the Internet protocol suite are applicable (e.g., user "log on" (authentication), integration with various local security solutions, user-based trust relationships). The public-key based Grid Security Infrastructure (GSI) protocols are used for authentication, communication protection, and authorization. GSI builds on and extends the Transport Layer Security (TLS) protocols to address most of the issues listed above: in particular, single sign-on, delegation, integration with various local security solutions (including Kerberos), and user-based trust relationships. X.509-format identity certificates are used. The Grid will also offer a larger variety of resources, for example remote execution of software, use of computing power and secure access to remote networks, similar to Virtual Private Networks (VPNs).
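As an illustration of the X.509 identity certificates on which GSI relies, the widely used Python cryptography package can load and inspect one (a sketch assuming a PEM-encoded certificate file is at hand; the file name is hypothetical):

from cryptography import x509

# Load a PEM-encoded X.509 identity certificate from disk.
with open("user-cert.pem", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())

# The subject and issuer names are what mutual authentication checks
# against a trusted certificate authority; the validity period bounds
# how long the credential may be used.
print(cert.subject.rfc4514_string())
print(cert.issuer.rfc4514_string())
print(cert.not_valid_after)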
5.1. Security in distributed environments
Distributed systems and collaborative environments, such as widely distributed supercomputers and large-scale storage systems, data sharing in restricted collaborations, network-based multimedia collaboration channels, and distributed production systems, give rise to a range of requirements for distributed access control and for the overall security of the systems. In all of these scenarios, the resource (data, instrument, computational and storage capacity, communication channel) has multiple owners, and each owner will impose use-conditions on the resource. All of the use-conditions must be met simultaneously in order to satisfy the requirements for access, as the sketch below illustrates. Furthermore, today it is the norm that the members (nodes) of such distributed networks tend to be diffuse, geographically distributed, and multi-organizational. The security and access-control mechanisms must therefore accommodate these special circumstances.
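A minimal sketch of this requirement, with all owner names and conditions invented for illustration, might combine per-owner use-conditions as follows:

# Hypothetical sketch: a resource with multiple owners, each imposing its own
# use-condition; access is granted only if every condition holds simultaneously.
from typing import Callable, Dict

UseCondition = Callable[[Dict], bool]

# Each owner contributes one condition on the access request (all names invented).
owner_conditions: Dict[str, UseCondition] = {
    "instrument-owner": lambda req: req["purpose"] == "research",
    "data-owner":       lambda req: "data-sharing-agreement" in req["credentials"],
    "network-owner":    lambda req: req["bandwidth_mbps"] <= 100,
}

def access_allowed(request: Dict) -> bool:
    # No central mediator: every owner's assertion is evaluated independently,
    # and all of them must be satisfied at the same time.
    return all(cond(request) for cond in owner_conditions.values())

request = {"purpose": "research",
           "credentials": {"data-sharing-agreement"},
           "bandwidth_mbps": 50}
print(access_allowed(request))   # True only if all three owners agree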
The goal for security in such distributed environments is to reflect, in a computing- and communication-based working environment, the general principles that have been established in society for policy-based resource access control. Each involved entity/node should be able to make its assertions without reference to a mediator, and especially without reference to a centralized mediator (e.g., a system administrator) who must act on its behalf. Only in this way will computer-based security systems achieve the decentralization needed for scalability in large distributed environments. The resource access control mechanisms should be able to collect all of the relevant assertions and make an unambiguous access decision, without requiring entity-specific or resource-specific local, static configuration information that must be centrally administered.
In order for security to be a successful part of the distributed environment, providing both protection and policy enforcement, each principal entity should have no more and no less involvement than it has in the currently established procedure that operates in the absence of computer security. Only the form has to change, e.g., a digital signature instead of signing a paper. In such a system, this sort of security infrastructure should provide the basis of automated management of resources, preceding the construction of dynamically and just-in-time configured systems that support different user-defined, application-oriented requirements. The expected advantage of computer-based systems is in maintaining access control policy, but with greatly increased independence from temporal and spatial factors (e.g., time zone differences and geographic separation), together with automation of redundant tasks such as credential checking and auditing.
Security architectures represent a structured set of security functions (and the needed hardware and software methods, technologies, tools, etc.) that can serve the security goals of the distributed system. In addition to the security and distributed enterprise functionality, security is as much (or more) a deployment and user-ergonomics issue as a technology issue. That is, the problem is as much one of finding out how to integrate good security into the industrial environment so that it will be used, trusted to provide the protection that it offers, easily administered, and really useful.

5.2. Human aspects of security in smart organizations
Trust among the members of networked systems is critical. Without trust, commitment to the goals of the organization can waver, as members perceive the alliance as weak or disintegrating, fractured by misunderstanding or mistrust [41]. Trust is particularly important in a networked organization that requires constant and close attention to shared commitments to safety and reliability, as well as a shared willingness to learn and adapt. It has been suggested that trust permits a networked organization to focus on its mission, unfettered by doubts about other members' roles, responsibilities and resources, and that with trust, synergistic efforts in inter-organizational missions are possible.
Trust plays an important synthesis role as well: with trust, a networked organization (NO) with its flexible organizational structures can leverage the ability and willingness to learn, thereby enhancing performance and attention to reliability over time. Networked organizations with high levels of trust among their members can effectively utilize interactions and communication processes at their interfaces, so members can learn together and can develop shared mental models of reliability and a shared culture of safety. Finally, high levels of trust also contribute to strengthening connections among member organizations. Trust among members is an important precondition for turning those connections into partnerships, thus decreasing risks [41].
Sabherwal examined the role of trust in Outsourced Information System Development (OISD) projects [42]. In this environment the best-fitting definition of trust was "confidence that the behavior of another will conform to one's expectations, and confidence in the goodwill of another". The analysis concentrates on trust between groups of people working together. Approached from the users' side, trust has an emotional component (a feeling of security, confidence) and a cognitive component (beliefs, expectancies). According to the classification of McKnight and Chervany [43], this relation can be described with the Trusting Intention and Trusting Belief constructs. These two components are related to institutional phenomena (System Trust). During the development phase of an information system, the willingness to depend, the trusting beliefs, and the situation-specific trusting behaviors of future users are present (the Trusting Intention, Trusting Belief and Trusting Behavior constructs). For the managers of information systems, belief, intention and behavior are the most important components of trust in the contact with their subordinates. In this contact the relationship between trust and power is also important, as managers have power originating from their position. The sometimes unstable power situation between employees and managers can be controlled by well-defined rules and control mechanisms of the firm (System Trust).

5.3. Application of security in the life-cycle phases
Trust can appear in different roles in a networked organization. The main fields where the types of trust can be applied are the organization hierarchy, communication, and information handling and storage. Bringing together the life-cycle phases of the NO and the proper types of trust needed for each phase makes it possible to select security services that support the development of the actual trust type. As a next step, the security mechanisms can be selected that generate the results needed for the actual security service. In this way a proper algorithm can be selected that helps to form the feeling of trust in a human being while using a computer-based networked system. As can be seen from Table 5, by establishing secure communication (by applying encryption and digital signature security mechanisms) the basic trust can be developed for the staff of the co-operating partners. There are other fields of security in which steps also have to be taken to develop a secure environment (virus defense, firewalls, physical security, human training, etc.) that raise the level of trust both in humans and in organizations [44].
Table 5. Life-cycle phases of the NO, the needed trust types, and the realization mechanisms

Phase: … ; needed trust types: … ; security services: access control, authentication, confidentiality, integrity, non-repudiation; security mechanisms: encryption, digital signatures
Phase: Closing operation; needed trust types: interpersonal, system, object; security services: access control, authentication, confidentiality, integrity, non-repudiation; security mechanisms: encryption, digital signatures
Phase: Break-up of the NO; needed trust types: interpersonal, system; security services: access control, authentication, confidentiality, integrity, non-repudiation; security mechanisms: encryption, digital signatures
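As a hedged, illustrative rendering of the two mechanisms named in Table 5, the following sketch uses the third-party Python 'cryptography' package (an assumption; any comparable library would do) to encrypt a message for confidentiality and to sign it for integrity and non-repudiation; key sizes and parameters are illustrative only.

# Hedged illustration of the mechanisms in Table 5: encryption and digital
# signatures, using the 'cryptography' package (pip install cryptography).
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

message = b"purchase order #1234: 500 units"

# Confidentiality: symmetric encryption of the message.
key = Fernet.generate_key()
token = Fernet(key).encrypt(message)

# Integrity and non-repudiation: an RSA digital signature over the message.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
signature = private_key.sign(
    message,
    padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                salt_length=padding.PSS.MAX_LENGTH),
    hashes.SHA256(),
)

# The receiver decrypts and verifies; verify() raises InvalidSignature on tampering.
assert Fernet(key).decrypt(token) == message
private_key.public_key().verify(
    signature, message,
    padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                salt_length=padding.PSS.MAX_LENGTH),
    hashes.SHA256(),
)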
6. CONCLUSIONS
Network-based organizations, like the smart organization, are main elements of the Information and Knowledge Society. These organizations apply ICT very intensively, both for internal and for external cooperation, in order to react flexibly to the changing business environment. Their business processes also have to be reengineered in these cases. ICT has a strategic role in business processes, as ICT influences the success of BPR. The infocom systems applied by companies have a human part as well: the users. As pointed out by different analyses based on real-life statistics, when users do not trust a system or service, they do not use it. Security services provide this trust for the users, so the importance of security is increasing very fast. Organizations have to adapt their ICT systems to this requirement as well, even by slightly changing their culture or organization structures.
The main tools for generating trust in users and organizations are the elements of complex security systems comprising hardware and software. This chapter focused on communication security, applying different security mechanisms with which trust can be developed directly between individuals and systems. A minimum set of security mechanisms (encryption, digital signatures) was given, based on an analysis of the life-cycle phases of networked organizations and the types of trust needed in each of these phases.
Networked systems of different sizes will play a definite role, but owing to their openness and flexibility their information and communication systems will always be a security risk. The managers of information technology have to integrate these
technologies, tools and devices into their systems to provide a level of security that can induce trust in all the humans involved in the different phases of the life-cycle of networked organizations.

REFERENCES

[1] European Commission (1997). Green Paper on the Convergence of the Telecommunications, Media and Information Technology Sectors and the Implications for Regulation. Brussels.
[2] Ungson, G. R. and Trudel, J. D. (1999). "The Emerging Knowledge-based Economy." IEEE Spectrum, May.
[3] Filos, E. and Banahan, E. (2000). Will the Organisation Disappear? The Challenges of the New Economy and Future Perspectives, in: Camarinha-Matos, Afsarmanesh, Rabelo (eds.): E-Business & Virtual Enterprises, Dordrecht: Kluwer, pp. 3-20.
[4] Lipnack, J. and Stamps, J. (1997). Virtual Teams: Reaching Across Space, Time, and Organisations with Technology. New York: John Wiley & Sons.
[5] DeSanctis, G. and Poole, M. S. (1997). Transitions in teamwork in new organisational forms. Advances in Group Processes, 14, 157-176. Greenwich, CT: JAI Press Inc.
[6] Gamble, P. R. (1992). The virtual corporation: An IT challenge. Logistics Information Management, 5(4), 34-37.
[7] Wong, T. T. and Lau, H. C. W. (2002). The Impact of Trust in Virtual Enterprises, in: Knowledge and Information Technology Management in the 21st Century Organizations: Human and Social Perspectives, Ed.: A. Gunasekaran, Idea Group Publishing, Hershey, PA (USA), London (UK), Chapter X, pp. 153-168.
[8] Koestler, A. (1989). The Ghost in the Machine. Arkana Books, London.
[9] Mezgar, I. (1998). Communication Infrastructures for Virtual Enterprises, position paper at the panel session on "Virtual Enterprising: the way to Global Manufacturing", in: Proc. of the IFIP World Congress, Telecooperation, 31 Aug.-4 Sept. 1998, Vienna/Austria and Budapest/Hungary, Eds.: R. Traunmuller and E. Csuhaj-Varju, pp. 432-434.
[10] Mezgar, I., Monostori, L., Kadar, B., and Egresits, Cs. (2000). Knowledge-Based Hybrid Techniques Combined with Simulation: Application to Robust Manufacturing Systems, in: Academic Press theme volumes on "Knowledge-Based Systems Techniques and Applications", Ed.: C. T. Leondes, Academic Press, San Diego, Vol. 3, Chapter 25, pp. 755-790.
[11] Monostori, L. and Barschdorff, D. (1992). Artificial neural networks in intelligent manufacturing. Robotics and Computer-Integrated Manufacturing, Vol. 9, No. 6, 421-437.
[12] Bonabeau, E., Dorigo, M., and Theraulaz, G. (1999). From Natural to Artificial Swarm Intelligence. Oxford University Press.
[13] Dorigo, M., Di Caro, G., and Gambardella, L. M. (1999). Ant Algorithms for Discrete Optimization. Artificial Life, 5(2), 137-172.
[14] Di Caro, G. and Dorigo, M. (1998). AntNet: Distributed Stigmergetic Control for Communications Networks. Journal of Artificial Intelligence Research (JAIR), 9, 317-365.
[15] Jennings, N. R. (2001). An Agent-Based Approach for Building Complex Software Systems. Communications of the ACM, Vol. 44, No. 4, April 2001, pp. 35-41.
[16] Jennings, N. R. and Wooldridge, M. (1998). Applications of Intelligent Agents, in: Agent Technology: Foundations, Applications and Markets, Eds.: N. R. Jennings and M. Wooldridge, Springer Verlag, pp. 3-28.
[17] Genesereth, M. R. and Fikes, R. E. (Eds.) (1992). "Knowledge Interchange Format", Version 3.0 Reference Manual. Computer Science Department, Stanford University, Technical Report Logic-92-1.
[18] Churchman, C. W. (1971). The Design of Inquiring Systems: Basic Concepts of Systems and Organization. New York: Basic Books.
[19] Malhotra, Y. (2000). Knowledge Management for [E-]Business Performance. Information Strategy: The Executive's Journal, 16(4), Summer 2000, pp. 5-16.
[20] Tanenbaum, A. S. (1996). Computer Networks, Third Edition. Prentice-Hall.
[21] The Wi-Fi Revolution, UNWIRED: Special Report of Wired Magazine, Issue 11.05, May 2003.
[22] Engst, A. and Fleishman, G. (2003). The Wireless Networking Starter Kit. Peachpit Press, Berkeley.
[23] Highspeed Internet on the power grid, ASCOM, http://phaidra.ascom.com/digitalasseto LFiles/682/file157864 _OjD LFileNarne/Highspeed.Jnternet.E.pdf
[24] Foster, I. (2000). Internet Computing and the Emerging Grid. Nature, 7 December 2000, http://www.nature.com/nature/webmatters/grid/grid.html
[25] Foster, I., Kesselman, C., and Tuecke, S. (2000). "The Anatomy of the Grid: Enabling Scalable Virtual Organisations" (White Paper), http://www.globus.org/research/papers/anatomy.pdf
[26] Davenport, T. H. and Short, J. E. (1990). "The New Industrial Engineering: Information Technology and Business Process Redesign." Sloan Management Review, Summer 1990, pp. 11-27.
[27] Teng, J., Grover, V., and Fiedler, K. (1994). "From Business Process Reengineering to Organizational Transformation: Charting a Strategic Path for the Information Age." California Management Review, Vol. 36, No. 3, pp. 9-31.
[28] The 2003 CSI/FBI Computer Crime and Security Survey, "Computer Security Issues & Trends", 2003, Vol. VIII, No. 1, May 29, 2003, http://www.gocsi.com/forms/fbi/pdf.html
[29] Trusted Computer System Evaluation Criteria ("Orange Book"), DoD 5200.28-STD, Department of Defense, December 26, 1985. Revision: 1.1, Date: 95/07/14.
[30] ISO/IEC 10181-1:1996. Information technology - Open Systems Interconnection - Security frameworks for open systems: Overview.
[31] ISO/IEC 15408:1999. Evaluation Criteria for Information Technology Security.
[32] Rousseau, D. M., Sitkin, S. B., Burt, R. S., and Camerer, C. (1998). Not so different after all: a cross-discipline view of trust. Academy of Management Review, 23(3), 393-404.
[33] Luhmann, N. (1979). Trust and Power. Chichester: Wiley.
[34] Menezes, A., van Oorschot, P., and Vanstone, S. (1996). Handbook of Applied Cryptography. CRC Press.
[35] Anderson, R. (2001). Security Engineering: A Guide to Building Dependable Distributed Systems. New York: John Wiley & Sons, Inc.
[36] Schneier, B. (1996). Applied Cryptography. John Wiley & Sons, Inc.
[37] Balaban, D. (2001). "Fortifying the Network." Card Technology, May 2001, pp. 70-82.
[38] Koller, L. (2001). "Biometrics Get Real." Card Technology, August 2001, pp. 24-32.
[39] Wi-Fi security, Wi-Fi Alliance, http://www.weca.net/
[40] Koprowski, G., Hacking the Power Grid, http://www.landfield.com/isn/mail-archive/1998/Jun/0033.html
[41] Handy, C. (1995). Trust and the virtual organisation. Harvard Business Review, 73(3), 40-50.
[42] Sabherwal, R. (1999). The Role of Trust in Outsourced IS Development Projects. Communications of the ACM, 42(2), 80-86.
[43] McKnight, D. H. and Chervany, N. L. (1996). "The Meanings of Trust." University of Minnesota, Management Information Systems Research Center (MISRC), Working Paper 96-04.
[44] Mezgar, I. and Kincses, Z. (2001). Secure Communication in Distributed Manufacturing Systems, in: Agile Manufacturing: The 21st Century Competitive Strategy, Ed.: A. Gunasekaran, Elsevier Science Publishers, Amsterdam, pp. 337-356.
BUSINESS PROCESS MODELLING AND ITS APPLICATIONS IN THE BUSINESS ENVIRONMENT
BRANE KALPIC, PETER BERNUS, AND RALF MUHLBERGER
1. INTRODUCTION
Globalisation, the process of creating a common, worldwide and open market, is one of the key features of the external environment of business systems today. Globalisation, as the result of the rapid development of information and communication technologies (fast access to accurate, reliable and adequately structured data), of transport systems, and of the adoption of common standards (which provide the worldwide comparability and compatibility of products) (Westkamper, 1997), also allows the fusion of local and national markets into a global one, and is one reason for partnership and integration between customers and suppliers, and for cooperation or even mergers of previous competitors.
Unpredictability and changeability in the internal and the external environment is experienced by enterprises as turbulence (Warnecke, 1993), and requires responsiveness and flexibility in the organisation and in the execution of processes as well. Customer orientation and the time needed to turn an idea into a final product are increasingly important elements of competitiveness. Quality, technical sophistication and price competitiveness of a product are no longer sufficient on the market. The product must be able to fulfil individual customer demands, as reflected in the increasing individualisation of production (economy of scope).
Information and knowledge are becoming strategic resources in addition to traditional ones, such as raw materials, energy and food, which used to be the basis of
progress of national economies for decades (Warnecke, 1993). Therefore, information and communication technologies can be considered today as strategic technologies, and knowledge is considered the key capital of enterprises.
The rapid changes and development in the area of new materials, methodologies, technologies, and techniques (deep integration of customers and suppliers in the product life-cycle, network and virtual enterprises, project management, concurrent engineering, modern information systems, various approaches in product development and design, new production and logistic concepts, new production paradigms, etc.) have resulted in a rapid reduction of development time, rising complexity and functionality, and reduction of cost even in the most demanding products.
All the above features of a contemporary business environment require a restructuring of business processes, the achievement of their efficiency and effectiveness, improvement of their management, their higher-level integration and automation, and the reusability and redeployment of knowledge integrated in processes. Therefore, there is a need for an adequate description (modelling) of business processes, for their analysis, and for knowledge capturing and redeployment techniques, tools and methodologies.
This chapter presents business process modelling as the response to the aforementioned requirements. The chapter starts with the introduction of the theoretical background of business process modelling (BPM), its basic concepts and different applications in the business environment. Section 2 gives a definition of 'business process' and 'business process model' and presents a simple abstract model of artificial systems, which can be used to define different types of business processes and categories of process models. The section also discusses the relationships between models, modelling languages and modelling tools as defined in the GERAM framework (IFIP-IFAC, 2003). Furthermore, the application of CIMOSA (Section 2.5) and of Workflow Modelling languages is presented, as well as Workflow Management as a special application of BPM and Business Process Management (Section 2.6). Section 3 discusses the ISO 9000:2000 standard requirements related to business processes, as well as general guidelines and an interpretation of the standard requirements regarding:
• the definition of business process interactions,
• the identification and differentiation of product realisation and support processes, and
• organisational, resource and information models of the business enterprise.
Sections 4 and 5 discuss the application of BPM in the field of business process reengineering (BPR), as well as the role of BPM in Knowledge Management (KM). The authors believe that BPM is an important tool for KM in the business environment, through capturing informal knowledge in a pragmatic, formalised and structured form that can be disseminated and shared throughout the organisation.
[Figure 2.1: Cybernetic model of an artificial system (Doumeingts et al., 1998). The figure shows the Management and Control System connected through the Management Information System to the Physical system (manufacturing/service), with communication with the external environment (external information), physical system information (internal information), and flows of energy, raw material and information/data.]

2. BUSINESS PROCESS MODELLING
2.1. The model of an artificial system
The structural and behavioural characteristics of artificial systems can be studied using a simple cybernetic model (see Figure 2.1). The model consists of three main components (Chen and Doumeingts, 1996):
• The Physical system is the component of the artificial system responsible for performing processes and activities intended to transform system inputs into system outputs (goods, services and by-products) by the application of the system's resources (human, technical, financial, etc.). Thus the 'physical system' is responsible for satisfying the system's mission;
• The Management & control system (often called the 'decision system') is the component of the artificial system responsible for the co-ordinated functioning of the physical system according to the artificial system's mission and objectives. The management of the physical system is done through 'orders' (orders may be the result of a negotiation, hence the inverted commas, or purely delivered by a control system) (Bernus and Nemes, 1999). These 'orders' are the product of decision-making processes. Decision-making processes follow a logic controlled by a set of system objectives, constraints and decision variables;
• The Management information system connects the physical system and the management & control system and delivers feedback as well as aggregate information suitable for decision support. Decision-making processes also exchange information with the external environment, and this is done through the management information system.
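Purely as an illustration of the three components and their interactions (all class and method names are invented), the model can be sketched as follows:

# Illustrative-only sketch of the three-component cybernetic model.
from dataclasses import dataclass, field
from typing import List

@dataclass
class PhysicalSystem:
    """Transforms inputs into outputs using the system's resources."""
    def execute(self, order: str, inputs: List[str]) -> str:
        return f"output produced for '{order}' from {inputs}"

@dataclass
class ManagementInformationSystem:
    """Connects the physical system with management & control; carries feedback."""
    feedback: List[str] = field(default_factory=list)
    def report(self, info: str) -> None:
        self.feedback.append(info)        # internal information, aggregated

@dataclass
class ManagementAndControlSystem:
    """Issues 'orders' derived from objectives, constraints and decision variables."""
    mis: ManagementInformationSystem
    def decide(self) -> str:
        # Decision-making uses the feedback delivered through the MIS.
        return "order" if not self.mis.feedback else "corrective order"

mis = ManagementInformationSystem()
control = ManagementAndControlSystem(mis)
plant = PhysicalSystem()
mis.report(plant.execute(control.decide(), ["raw material", "energy"]))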
The same division of an artificial system into Service and Management & Control parts is present in Enterprise Reference Architectures such as PERA (Williams, 1994) and GERA (IFIP-IFAC, 2003).

2.2. Business processes and business process modelling
2.2.1. Business process
The Oxford English Dictionary (1999) defines 'process' as a series of actions or operations conducing to an end, or as a set of gradual changes that lead toward a particular result. Thus, following Section 2.1, business processes (i.e. processes performed by the 'physical system') are a set of activities intended to transform system inputs into desired (or necessary) system outputs by the application of system resources. It is customary to enrich this definition with characteristic properties that stress the business nature of a process. According to Davenport (1993) and the ISO 9000:2000 family of standards (2000), a 'business process' is a structured and measured, managed and controlled set of interrelated and interacting activities that uses resources to transform inputs into specified outputs (goods or services) for a particular customer or market. Davenport also proposes a differentia specifica of business processes: every process relevant to the creation of added value is a business process.

2.2.2. Business process model
2.2.2.1. WHAT IS A MODEL? A model¹ is a set of facts about an entity (captured in some structured and documented form), provided that:
• there is a known mapping between the captured facts and the real-world entity (its constituents and properties),
• all consequences of these facts agree with the relevant properties of the modelled entity,
• no consequences of the captured facts are in contradiction with the relevant properties of the modelled entity, and
• all relevant properties of the modelled entity are either explicitly represented in the model, or can be inferred from these facts.
Thus a simple list of facts about an entity A is not necessarily a model of A. The set of facts becomes a model only if all relevant facts are captured. Depending on the nature of the facts, consequences may be derived using logical rules of inference, or mathematical equations. In simple terms: "Model M models entity A, if M answers all relevant questions about A". Depending on the types of questions that the model is supposed to answer (the 'relevant' questions), many types of models can be developed, each representing an aspect, or view, of the same entity. For every type of model there is a set of inference rules; therefore, in practice the developer of the model does not have to include these with the model, provided that:
• the document clearly identifies to what model-type it belongs, and
• the given model-type's inference rules are uniformly available and understandable to all, i.e. to those who develop, validate or use the model (the 'users' of the model).
Unfortunately this second requirement is not always met in BPM, and this has a number of negative consequences. E.g. a business process analyst may request people who routinely perform a business process to verify that the analyst's model is a correct representation of reality. These people might unknowingly accept an incorrect model as correct (e.g. in case all explicitly represented facts are correct), and not realise that some facts may be inferred from this model that are in contradiction with reality.
Models can be built for various purposes: for documentation (so that the same or a similar system may be (re)constructed based on the model), or for the analysis of a system (or part of a system) and its properties (so that a particular aspect of a system can be studied). Modelling is an abstraction (and a mapping) of the real world into a formal representation, where the relevant facts² are expressed in terms of some formalism (called a modelling language³). There is always a difference between the real world and the model. Only a real-world system is a perfect representation of itself; models are only approximations of the real-world entity. The difference between a system and its model may be considered as a form of semantical gap (see Figure 2.2). E.g. people who are part of a system have the potential to use a unique system as a reference to share meanings, whereas those who only see a limited set of models of this system have the potential to develop meanings different from those developed 'within the system'. This is because even if a set of formal models is a correct representation of the system in question, they are also a correct representation of other potential systems.
[Figure 2.2: Mapping of the real world into the model, illustrating the semantical gap between a real-world entity (e.g. worker John and his equipment) and its model.]
¹ In many engineering disciplines, the word 'model' is the equivalent of what mathematical logic calls a 'theory'. The definition above uses this engineering terminology.
² To the reader familiar with mathematical logic: the word 'fact' is used here in its everyday meaning, covering propositions, constraints, rules, etc.
³ For the purposes of the user, a modelling language may be defined as a set of modelling constructs (and rules that govern how they can be combined to form a valid model).
Therefore, for a representation to qualify as a model it is necessary that there exist no unintended interpretations. This last requirement is especially important when models are created about a future system (i.e. a new, or a modified existing, one).
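As a toy illustration of the definition above (the facts and the inference rule are invented), a model can be seen as a set of captured facts closed under inference rules, which then answers questions that were never stated explicitly:

# Toy illustration: a 'model' as a set of facts plus inference rules,
# queried by computing the closure of the facts.
facts = {("worker", "John"), ("operates", ("John", "lathe"))}

def infer(facts):
    """One invented inference rule: whoever operates equipment is trained on it."""
    derived = set(facts)
    for fact in facts:
        if fact[0] == "operates":
            person, equipment = fact[1]
            derived.add(("trained_on", (person, equipment)))
    return derived

model = infer(facts)
# The model answers a question not stated explicitly among the captured facts:
print(("trained_on", ("John", "lathe")) in model)   # True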
2.2.2.2. BUSINESS PROCESS MODELS AS SPECIFIC TYPES OF ENTERPRISE MODELS. Enterprise models are formal representations of the structure, functions (activities, processes), information, resources, behaviour, goals and constraints, etc. of a business, government or other enterprise. An Enterprise Model is a model of 'enterprise objects'⁴ and their dependencies (Gruniger, 1997). Models may have different manifestations: they can be expressed using different formalisms, be processable or not, may incorporate more or less common sense, and may be expressed on different levels of abstraction and detail. Practitioners often refer to 'formal' and 'less formal' models, but according to the above definition of what a model is, these 'less formal' models are always incomplete representations. Incomplete representations can serve a useful purpose, e.g. for clarifying explanations, but should not be used for analysis or as a specification.
It is practically not possible to create a single all-embracing model of an enterprise. Due to the complexity and size of enterprises, instead of a single model a set of models is developed. The enterprise is therefore described by a collection of interrelated, special-purpose models, each concentrating on an aspect or view of the enterprise (Bernus et al., 1996). There are various enterprise models, such as process, data, resource, product, computer network topology, organisation, technical and engineering enterprise models, etc. The selection of the type of models to be developed, the need for them to be complete and consistent, as well as the level of detail and abstraction, are driven by an understanding of the current state of affairs and by the pragmatic needs of planned or anticipated future stages of development/evolution.
Traditionally, the prime goal of enterprise modelling is to support process analysis, integration, automation and computer control. Enterprise modelling is also becoming popular in the area of business design, where the way of doing business is represented as a model (defining co-operative arrangements, enterprise networks, virtual enterprises) and where the model provides insight into potential strategic behaviours of planned business arrangements.
Business Process Models are a specialised category of enterprise models, and focus on the description of business process features and characteristics. For example, business process models are used for the definition of the functionality and structure of a process (sub-processes, activities and operations), the sequence of activities and their relationships, the cost and resource usage characteristics, etc. Business process models may be used to achieve (Vernadat, 1996):
• reduction (or better understanding) of process complexity
• improved transparency of the system's behaviour and, through it, better management of business processes
• better understanding and uniform representation of the entity in question
• capitalisation of acquired business knowledge and improvement of its reusability
• process improvement (to improve the characteristics of business processes).
⁴ An 'enterprise object' in this context is an enterprise or any of its constituents, whether material, information, human or technical, and irrespective of whether manifested as software or hardware, or any aggregation of these.
Support of the model development process is usually necessary on two accounts: 1) Reference models should be available (standards, reusable blueprints, best practice captured in the form of models) so that models need not be built from scratch; 2) Enterprise modelling tools should be used that support the creation, analysis, maintenance and distribution of these models.

2.2.3. Categories of business process models and business process types
2.2.3.1. CATEGORIES OF BUSINESS PROCESS MODELS. The purpose of modelling determines what features and properties of business processes need to be represented. There are two major categories of business process models: activity models and behavioural models.
Activity models concern the functionality of the business process, i.e. the 'things to be done' or 'tasks' (activities and operations performed within the process). Activity models are primarily concerned with the ways in which business activities are defined and connected through their products and resources. Therefore, activity models characterise a process by describing a) its structure (sub-processes and activities), b) the required inputs and delivered outputs for each sub-process or activity, c) control relationships, and d) the resources needed for activity/process execution, and they highlight the roles that objects play in them. Activity models are constructed using the functional decomposition principle (for more detail see Section 2.5). These process models do not represent sequences of control (state transitions, before-after relations, exception handling) or temporal properties (timing of process activities). Therefore, activity models are constructed if the reason for modelling is the desire to understand or design a process in terms of how it is constructed out of elementary activities and how these activities are interconnected. Activity models abstract from time and state transitions; therefore they are useful if
• the analyst/designer wants to identify the interfaces between the activities of a process, and deliberately delay the commitment regarding state transitions and timing, aiming to determine these details in a subsequent design step (and thereby leaving the possibility open for many different implementations); or
• the nature of the process is such that every execution is likely to be different in terms of state transitions or timing. This is the case with many policy-driven and/or creative processes, i.e. every process that does not have a control flow that can be pre-determined by design.
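A hedged sketch of such an activity-model record, with invented sample data and in the spirit of the description above (inputs, outputs, controls and resources, but no state transitions or timing), could look as follows:

# Hedged sketch of an activity-model record: each activity is characterised by
# inputs, outputs, control relationships and resources; no timing is captured.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Activity:
    name: str
    inputs: List[str]
    outputs: List[str]
    controls: List[str] = field(default_factory=list)    # control relationships
    resources: List[str] = field(default_factory=list)   # mechanisms for execution
    subactivities: List["Activity"] = field(default_factory=list)  # decomposition

design = Activity(
    name="Design product",
    inputs=["customer requirements"],
    outputs=["product specification"],
    controls=["design standards"],
    resources=["design engineer", "CAD system"],
)
manufacture = Activity("Manufacture product",
                       inputs=["product specification", "raw material"],
                       outputs=["product"])

# Interfaces between activities are identified by matching outputs to inputs.
print(set(design.outputs) & set(manufacture.inputs))   # {'product specification'}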
Behavioural models capture the flow of control within a process: the rules of the sequence in which activities are (or must be) performed. This can be done explicitly (describing a procedure) or implicitly (describing rules of transition, also called behavioural rules). Behavioural models do not necessarily define the objects and resources used or produced by the process; the need to do so depends on the reason for developing the model. These models are particularly well suited for the design or analysis of business processes in which the timing and/or sequencing of events is critical (for example, in the development of simulation models). Behavioural models are executable representations of a process (similar to a computer program), thus they can also be used for process control (or process tracking), in which case they also need to represent the objects exchanged and the resources used. In addition to the representation of the control flow, behavioural models might also incorporate:
• exception handling mechanisms: the definition of possible process scenarios and their relations
• the temporal aspect: the dimension of time (e.g., activity durations such as minimum, maximum, average or standard times, delays between process activities, triggering frequencies, and possibly the probability distributions of the above, etc.)
• co-operative activities: the definition of message exchange (e.g. data/information views described as objects, naming the objects exchanged, defining their structures and states) and material exchange (volumes, batches, etc.). Message exchange may be defined in either of two ways: through the mechanism of sharing or the mechanism of passing. Co-operative activities use predefined operations (request, receive, send, broadcast, acknowledge) that may be built into the modelling language
• process synchronisation: synchronisation may be synchronous or asynchronous, and achieved through events, messages or object flows
• pre-conditions and post-conditions to be satisfied/completed by the process and its constituents.
2.2.3.2. BUSINESS PROCESS TYPES. Manufacturing and other business processes (e.g. engineering, design, production, etc.) performed in the physical system (see the structure of the cybernetic model) can be described by activity or behavioural models. While activity models can always be developed, behavioural models are feasible only for processes that follow known procedures or known rules of transition, which are therefore called structured processes (Vernadat, 1998). Unstructured processes can only be described as an activity model, i.e. by defining functions through their inputs, outputs and mechanisms, and circumscribing the contents of the function (using an explanation suited to the mechanism at hand). Ill-structured processes can only be described by their desired outputs, noting the range of inputs that might be necessary, as well as circumscribing the task in a way that is suitable for the mechanism (which in the case of ill-structured processes is invariably human). Typically, the inputs and outputs of unstructured and ill-structured processes can only be defined as policies, objectives, goals and constraints, rather than mechanistically provided 'control signals'.
The system of management is a mixture of structured, unstructured and ill-structured processes. Therefore, a fully structured process model for their definition is not possible.
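For the structured fraction of such processes, a behavioural model is essentially an executable transition system. A minimal hypothetical sketch, with all states and events invented for illustration:

# Minimal hypothetical sketch of a behavioural model as explicit transition
# rules; only a structured process, with pre-determined control flow, can be
# captured this way.
TRANSITIONS = {
    ("received", "approve"):  "approved",
    ("received", "reject"):   "rejected",      # exception-handling branch
    ("approved", "ship"):     "shipped",
    ("shipped",  "invoice"):  "closed",
}

def run(events, state="received"):
    history = [state]                           # the instance's past
    for event in events:
        state = TRANSITIONS[(state, event)]     # behavioural rule application
        history.append(state)
    return history

print(run(["approve", "ship", "invoice"]))
# ['received', 'approved', 'shipped', 'closed']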
On the highest level of management, some process structure may be defined, helping to co-ordinate the activities of the humans who co-operate to manage the enterprise. For these models to be followed and uniformly interpreted, it is expected that the definitions are interpreted by managers with a defined level of expertise and competency, and with commonly believed assumptions. As the description of management tasks becomes less structured, such a description (even if a structured description exists) is only a guideline, the only constraint from the enterprise's point of view being that the task is performed to produce the desired outputs or deliverables, and that while performing the management task the human involved will have considered the nominated crucial inputs to come to the decision. The exception to this relaxed definition is the interface between unstructured management tasks, where the enterprise may still wish to enforce procedurally defined communication protocols. It is only at the lowest level of management that management tasks become control functions; thus the control system can be defined through structured processes and procedures or behavioural rules.
As a consequence of the above discussion, a great deal of care must be taken when developing a model in support of the design of a management system (see Section 3.1). At every level of structuring the description into a model, one must ask whether further detailing of the task is legitimate and useful, i.e. whether the task is structured, unstructured or ill-structured. Mistakes in this regard are costly, because they may discredit the model in the eyes of its users.

2.3. Generalised enterprise reference architecture and methodology (GERAM)

In order to discuss business process models and their role in the wider scope of enterprise modelling, the GERAM framework is briefly presented below (GERAM: Annex A, ISO 15704:2000). While many other popular frameworks exist, this framework generalises their common characteristics. For mappings of other popular frameworks, such as ARIS, Zachman, CIMOSA, PERA, GRAI and C4ISR/DoDAF, onto GERAM see (Noran, 2003).

2.3.1. GERAM framework
GERAM (Generalised Enterprise Reference Architecture and Methodology) is about those methods, models and tools which are needed to build and maintain the integrated enterprise. GERAM also represents a tool-kit of concepts for designing and maintaining all types of enterprises for their entire life-history. Figure 2.3 represents the components of the GERAM framework (IFIP-IFAC, 2003).
The GERAM framework identifies as its most important component GERA (Generalised Enterprise Reference Architecture), defining the basic concepts to be used in enterprise engineering and integration. GERAM distinguishes between the methodologies for enterprise engineering (EEMs) and the modelling languages (EMLs) that are used by the methodologies to model the structure, content and behaviour of the enterprise entities in question. Methodologies propose to create and use enterprise models (EMs) to represent all or part of the enterprise's operations, including its manufacturing and service tasks, its organisation and management, and its control and information systems.
[Figure 2.3: GERAM framework. Components: GERA (Generalised Enterprise Reference Architecture), EEMs (Enterprise Engineering Methodologies), EMLs (Enterprise Modelling Languages), GEMCs (Generic Enterprise Modelling Concepts), PEMs (Partial Enterprise Models), EMs (Enterprise Models), EETs (Enterprise Engineering Tools), EMOs (Enterprise Modules), EOSs (Enterprise Operational Systems).]
These models can be used to guide the implementation of the operational system of the enterprise (EOS), as well as to improve the abilities of an existing enterprise. The methodology and the languages used for enterprise modelling are supported by enterprise engineering tools (EETs)⁵. The semantics of the modelling languages may be defined by ontologies, meta-models and glossaries that are collectively called generic enterprise modelling concepts (GEMCs). For enterprise models to be consistent, these ontological models must be consistent; e.g. a meta-schema may describe all concepts used in a set of modelling languages, where each language uses only a subset of these concepts. If the meta-schema is extended with all logical rules and constraints, then the semantics of the modelling languages becomes fully defined and the definition is called an ontological model. Since ontological models are usually developed by logicians, logicians prefer to use the mathematically correct term 'ontological theory' instead of the engineering term 'ontological model'.
The modelling process may be enhanced (made faster and of improved quality) through the use of partial models (PEMs), which are reusable reference models of human roles, of processes and associated information, and of technologies.
The implementation of enterprise models is supported by enterprise modules (EMOs), which are actual building blocks (physical or software resources) such as humans with skills, equipment, etc., and which are used to build (manifest) the actual operational enterprise (EOS) as a socio-technical system. Some of these modules may be pre-existing (humans with skills that the enterprise can hire, products, software) and some may have to be built (by training humans, commissioning hardware and software) or configured.
⁵ If the entity in question consists only of software then the term CASE (Computer-Aided Software Engineering) Tool is used instead.
[Figure 2.4: Modelling framework of GERA. Three dimensions: instantiation (generic, partial, particular); views (subdivision according to model content: function, information, resource, organisation; according to purpose of activity: customer service, management and control; according to means of implementation: human, machine; according to physical manifestation: software, hardware); and life-cycle phases (identification, concept, requirements, preliminary design, detailed design, implementation, operation, decommission). The generic and partial levels form the Reference Architecture, the particular level the Particular Architecture.]
with skills, equipment, etc. and which are used to build (manifest) the actual operational enterprise (EOS) as a socio-technical system. Some of these modules may be preexisting (humans with skills that the enterprise can hire, products, software) and some may have to be built (by training humans, commission hardware and software) or configured. 2.3.2. Generalised enterprise reference architecture (GERA)
GERA defines a set of generic concepts recommended for use in enterprise engineering and integration projects. These concepts can be classified as human-oriented (including individual, organisational and communication aspects), process-oriented and technology-oriented concepts. GERA identifies three dimensions for the definition of the scope and content of enterprise modelling (see Figure 2.4):
• Life-Cycle Dimension: describes a controlled modelling process of enterprise entities according to the life-cycle activities involved. The GERA life-cycle model defines a total of six life-cycle activity types, or life-cycle 'phases', of an entity (other frameworks may define fewer or more life-cycle phases in the definition of the entity's life-cycle, depending on the level of detail of this classification). The life-cycle concept represents a useful abstraction in understanding the life-history of any entity (which could be difficult to understand because of its complexity and individual idiosyncratic properties). According to ISO 15288 (System life-cycle processes), the life-history of an entity can be subdivided into stages, and each stage is usually characterised by the predominance of one of the life-cycle processes. Thus, the life-cycle is atemporal and is subdivided into phases, while the life-history is temporal and is subdivided into stages.
• Genericity Dimension: describes a controlled particularisation (instantiation) process from generic, through partial, to particular.
• View Dimension: describes a controlled visualisation of specific views of the enterprise entity: entity model content (function, information, resource, organisation), purpose (mission delivery, management & control), implementation (human, machine) and physical manifestation (hardware, software) views.
Any combination of these defines a legitimate scope of modelling, but depending on the modelling purpose the detail of these models may differ. E.g. the function view may be filled by an activity model, or by a behavioural model, or both (provided the two are consistent).
2.3.3. Business process modelling languages and tools

Enterprise modelling languages are defined and formalised as Generic Enterprise Modelling Concepts in one of the following ways:
• by natural-language explanation of the meaning of modelling concepts (glossaries),
• in some form of meta-model (e.g. an entity-relationship meta-schema), or
• in ontological theories: as formal models of the concepts that are used in enterprise representations, usually expressed in a (possibly extended) form of First Order Logic.
An ontology is a formal description of entities and their properties; it forms a shared terminology for the objects of interest in the domain, along with definitions for the meaning of each of the terms. The definition of an ontology consists of (IFIP-IFAC, 2003):
• Terminology: provides a shared terminology for the enterprise that every application can jointly understand and use;
• Syntax: defines all legal constructs of the language. The syntax definition makes it possible for a parser to examine a proposed expression, or a complete model, and accept it as legal (or reject it as illegal);
• Symbology: defines a set of symbols for depicting terms and concepts, often in a graphical form;
• Semantics: defines the meaning of the expressions written in the language. There are two usual ways to define the meaning of a language in a formal way: denotational (model-theoretic) semantics and operational semantics⁶ (it is necessary to define for this purpose a set of axioms and inference rules). The informal specification of a language's semantics usually includes the formal presentation of syntax and is accompanied by a natural-language description and explanation of concepts.
⁶ Further discussion of these is beyond the scope of this chapter.
Ideally, a modelling language must have a formal syntax and semantics. In terms of the level of syntactic and semantic formalisation, modelling languages can be classified as a) formal, b) semiformal, and c) informal languages. Modelling languages also differ in their expressive power. E.g. some business modelling languages may not be suitable for the description of all relevant facts of the subject area, and are not appropriate for certain analysis tasks. There is no one language which is equally suited for all modelling purposes (structure or behaviour description, activity relationships and dependencies, cost analysis, simulation or emulation purposes, etc.). Also, any subject area of modelling may be covered by more than one modelling language (IFIP-IFAC, 2003). This fact causes significant confusion for practitioners, because a) many languages need to be mastered, b) in the process of developing a model the practitioner may realise too late that the expressive power of the language is limited, forcing informal and idiosyncratic extensions to the language, and c) a language may be suited to a given life-cycle phase but not to a subsequent one, so model content must be translated from one language to the other.
In practice today, many different business process modelling languages are used, e.g. SADT (Structured Analysis and Design Technique), IDEF0, IDEF3, ARIS Event-Driven Process Chains (EPC), UML (Unified Modelling Language), Yourdon Data Flow Diagrams (DFD), the CIMOSA function view language, FirstStep (an enriched CIMOSA implementation built into the FirstStep tool), GRAI Grid, GRAI Nets, SA/RT (real-time structured analysis), Workflow Languages, Petri nets (simple, coloured, timed), IEM (Integrated Enterprise Modelling), etc. Because of the nature of the (visual) perception of human beings, the majority of modelling languages have a graphical symbology.
The Draft International Standard ISO DIS 19440⁷ 'Constructs for Enterprise Modelling', developed jointly by CEN⁸/TC 310/WG 1 and ISO TC184/SC5/WG1, defines (both in English and using UML meta-schemata) a comprehensive set of requirements that enterprise modelling (and specifically process modelling) languages need to satisfy. The standard originates from the CIMOSA languages, extended with decisional modelling constructs, but a complete definition of the syntax and semantics of these languages is not part of this document, and especially the detailed design (coding) level is not part of this standard. At the same time, a number of commercial systems have been developed that implement proprietary Workflow Languages that are suited for the implementation-level description of business processes and can thus be executed in a Process Execution Environment (Workflow Environment). A detailed description of the state of the art of Workflow Systems is given in Section 2.6.
⁷ As of April 2003. ⁸ CEN: European Standardisation Organisation.
As of today (2003), many Enterprise Engineering/Enterprise Modelling Tools allow the specified process models to be exported to Workflow Systems (by dedicated translation). The result of this translation is a workflow, which then must be completed by adding implementation-level details.
For the efficient development and implementation of business process models, modelling languages must be supported by adequate enterprise engineering (modelling) tools. Enterprise engineering tools should support the entire life of these models (from design, up to implementation, redesign, distribution and storage). Enterprise engineering tools should provide user guidance through the modelling process (information gathering, model building), support model analysis (simulations, evaluations, etc.), enable the connection of process models with the actual business process, and keep models up to date. The ideal modelling environment should be modular and extensible (rather than based on a closed set of models), so that alternative methodologies can be used in conjunction with the already existing ones (e.g. through enriching modelling language constructs, or adding new views, as appropriate).
On the market, different modelling tools (software vendors) can be found for the same modelling language. E.g., the IDEF family of languages is supported by the following tools⁹ (Tool: Language, Vendor): AI0 WIN (IDEF0, KBSI), ProSim (IDEF3, KBSI), AIWIN/BPWin (IDEF0, IDEF1X, IDEF3, Computer Associates), CORE (IDEF0, EFFBD¹⁰, Vitech Corporation), Workflow Modeler (IDEF0, IDEF1X, Workflow, Meta Software), Systems Architect (IDEF0/1X/3, UML, Popkins Software). Because of the limited expressive power of any particular process modelling language and/or the functionality of the supporting modelling tool, a set of complementary modelling languages and tools is usually needed. This also results in the need to exchange models between different tools. The exchange of process model information is very limited today; the reason for this is the diversity of tool-native formats (which are not interoperable with other modelling tools) and of modelling language constructs (even though there may be a well-defined language syntax and semantics, languages used for the same purpose may be based on incompatible ontologies).
The Process Interchange Format Working Group has proposed the development of a process interchange format (PIF) to help automatically exchange process descriptions among a wide variety of business process modelling tools and support systems. Instead of having to write ad-hoc translators for each pair of such systems, each system will only need a single translator for converting process descriptions between its own format and the common PIF format. Then any system will be able to automatically exchange basic process descriptions with any other system (PIF Working Group, 2003).
PIF aims to support the sharing of process descriptions in such a way that they can be automatically translated back and forth between PIF and other process representations with as little loss of meaning as possible. If translation cannot be done fully automatically, the human effort needed to assist the translation should be minimised.
⁹ It is not the intention of this chapter to present a complete list; only examples are given. ¹⁰ Extended functional flow block diagrams (proposed for the next UML extension).
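The following sketch illustrates only this hub-and-spoke translation idea; the dictionary layout and the tool formats are invented and are not the actual PIF syntax:

# Illustration of the hub-and-spoke translation idea only; the neutral format
# below is invented and is NOT the actual PIF syntax.
import json

def tool_a_to_neutral(tool_a_model):
    """Single translator from a (hypothetical) tool-native format to the hub."""
    return {
        "activities": [{"name": a["label"]} for a in tool_a_model["boxes"]],
        "flows": [{"from": s, "to": t} for s, t in tool_a_model["arrows"]],
    }

def neutral_to_tool_b(neutral):
    """Single translator from the hub into another (hypothetical) tool format."""
    return {"nodes": [a["name"] for a in neutral["activities"]],
            "edges": [(f["from"], f["to"]) for f in neutral["flows"]]}

native = {"boxes": [{"label": "Receive order"}, {"label": "Ship goods"}],
          "arrows": [("Receive order", "Ship goods")]}

neutral = tool_a_to_neutral(native)
print(json.dumps(neutral, indent=2))     # the shared interchange description
print(neutral_to_tool_b(neutral))        # re-imported by a second tool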
2.3.4. Enterprise reference models
In typical business environments there exist a number of common (business) processes which are similar or the same, no matter what the function or mission of the enterprise. Therefore, the adoption of reusable enterprise models (also called reference models or partial models) is a most significant improvement of efficiency and quality in the planning of new, or the redesign of existing, processes (Mertins and Bernus, 1998). There are three types of reference models which lend themselves to reuse: generic models (capturing the common aspects of a type of enterprise), paradigmatic models (where a typical, particular case is captured in model form and that model is subsequently modified to suit the new situation) (Bernus et al., 1996), and building-block models (a set of elementary model fragments that can be freely combined as components to form a complete model).
We now introduce the terms process type and process instance. A process type is a structure consisting of activities (e.g. 'the processes of product development'), each defined by its name and signature (inputs, outputs and conditions of execution), as well as the relationships between these (e.g. input-output relationships and possibly rules for execution) (Schmidt, 1998). A process instance is the execution in time of transformations on a set of concrete objects, as defined according to the rules of the process type. A process instance is therefore the real process following the rules and structure of a given process type. A process instance has a life-history of its own, which at any point in time consists of a past and a future: a) the past is a partially ordered set of events that happened during the execution up to the given point in time, and b) the future is the set of partially ordered sets of all possible events (possible futures) that could eventuate (where exactly one of these sequences will become the past at a given later point in time).
Given a process type, it is an interesting question what all the possible executions are, e.g. to find out whether there are any circumstances which might cause an instance of that process type to get into a deadlock, or any other undesirable state. Given a process instance, it is also interesting to find out what all the possible futures are, e.g. in order to control the process instance in some intended, or optimal, way. E.g., the LOTOS process modelling language was developed to model the behaviour of computer network communication protocols and to allow the analysis of all possible executions. LOTOS is based on process algebra; for a detailed discussion of the language refer to ISO 8807:1989 and van Eijk et al. (1989). Interestingly, the language has not been used for business process modelling (to the knowledge of this chapter's authors), in spite of its features that would make it a candidate for such use.
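A hypothetical sketch of this distinction, with all activities, rules and names invented: the type fixes the activities and rules of execution, while each instance carries its own past and set of possible futures.

# Hypothetical sketch of the process type / process instance distinction.
class ProcessType:
    def __init__(self, name, transitions, initial):
        self.name, self.transitions, self.initial = name, transitions, initial
    def instantiate(self):
        return ProcessInstance(self)

class ProcessInstance:
    def __init__(self, ptype):
        self.ptype, self.state, self.past = ptype, ptype.initial, []
    def possible_futures(self):
        # The next events permitted by the type in the current state.
        return [e for (s, e) in self.ptype.transitions if s == self.state]
    def occur(self, event):
        self.state = self.ptype.transitions[(self.state, event)]
        self.past.append(event)

development = ProcessType("product development",
                          {("idea", "specify"): "specified",
                           ("specified", "design"): "designed"},
                          initial="idea")
instance = development.instantiate()     # one execution in time of the type
instance.occur("specify")
print(instance.past, instance.possible_futures())  # ['specify'] ['design']

2.4. Business process modelling principles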
2.4.1. Process decomposition
Business processes can be very complex, being composed of several hundred activities and numerous relationships among them. Therefore, some process structuring is usually required (using process decomposition or aggregation).
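To make the process type concept of Section 2.3.4 and its decomposition concrete, a minimal sketch follows; the process, its activities and their signatures are invented for illustration, and real modelling tools use far richer constructs.

```python
from dataclasses import dataclass, field

@dataclass
class Activity:
    # Name plus signature: inputs, outputs and an execution condition.
    name: str
    inputs: tuple = ()
    outputs: tuple = ()
    condition: str = "true"

@dataclass
class ProcessType:
    # A process type structures activities and/or sub-processes, with
    # input-output relationships; decomposition is the nesting of
    # ProcessType nodes, aggregation the reverse reading of the tree.
    name: str
    steps: list = field(default_factory=list)   # Activity or ProcessType
    flows: list = field(default_factory=list)   # (producer, consumer) pairs

    def leaf_activities(self):
        for s in self.steps:
            if isinstance(s, ProcessType):
                yield from s.leaf_activities()
            else:
                yield s

order_handling = ProcessType(
    "order handling",
    steps=[
        Activity("receive order", inputs=("order",), outputs=("checked order",)),
        ProcessType("fulfilment", steps=[
            Activity("pick goods", inputs=("checked order",), outputs=("shipment",)),
            Activity("invoice", inputs=("checked order",), outputs=("invoice",)),
        ]),
    ],
    flows=[("receive order", "fulfilment")],
)
print([a.name for a in order_handling.leaf_activities()])
```

A process instance would then be a separate record tracking which of these activities have already produced their events (the past) against the executions still permitted by the type (the possible futures).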
Section 2.5 will discuss the CIMOSA process modelling language, which allows the specification of business processes, together with the objects they manipulate11, the resources used and the organisation (allocation of responsibilities), as well as the translation of this specification to computer representations that can be executed in a process execution environment. A process execution environment allows processes to execute so that the process can communicate with both human and automated resources (machines, application programs, databases) through appropriate interfaces. Such an environment maintains the description of process types, and keeps track of each execution of these (process instances). Multiple instances of the same process type may execute at the same time.

2.4.2. The granularity (depth) of process models
In the development of a business process model, practitioners are often faced with the question: given the life-cycle phase for which the model is developed, how in-depth should the description be, i.e. what should be its granularity? A general answer cannot be given, because the level of granularity in process description (modelling) depends on the model's purpose. According to Uppington and Bernus (1998), the level of granularity in BPM is driven by the need to understand the current state of affairs and by the pragmatic needs of the subsequent life-cycle phase of the change process and the personnel involved. In the case of a single company there are many shared contextual elements that allow a simple model to be produced which is still pragmatically complete. In the context of an industry, either the context of use must be defined in more detail (such as defining the necessary assumed knowledge and experience of the users of the reference model), or the model itself should be more detailed. However, for all parts of the model (such as the name of an activity, the name of a process or data element) there must be an adequate explanation included in the model, which ensures that the lowest level elements of the model are uniformly interpreted by everyone using the model.

The necessary level of process granularity is also connected to the nature of the process. E.g., if the activities performed by a human resource are creative in nature, or the human resource is highly qualified, then it is better to stop at higher level activities in process decomposition. This also means that process models might be developed so that the level of detail revealed depends on the skill level of the human user.

2.4.3. Modelling approach
Two different approaches to process model development are usually referred to: the bottom-up and the top-down approach (KBSI, 2001a). In the bottom-up approach, the building of business process models starts from the most detailed description (operations or activities) and proceeds up to a more general description (sub-processes or processes). This way of building a process model is often proposed for the development of AS-IS models (i.e., the modelling of existing processes).

11 Called object-views in CIMOSA.
Figure 2.5: Business Process Decomposition.
The top-down approach develops the process description from the definition of basic features (a high-level functional definition) towards a detailed, low-level definition. This approach is often proposed for the creation and definition of radically new TO-BE models, or models of a future state or system behaviour. Often a combination of these approaches is used in process modelling practice. For example, in the development of an AS-IS model, first a high-level or general description of the process is produced (a rough definition of processes and of the roles of employees, to define the context and scope of modelling), followed by the detailed definition of process activities and lower process entities on the basis of the actual tasks performed in the company.

2.5. CIMOSA process modelling language
CIMOSA includes constructs (AMICE, 1993; Kosanke, 1992) to model processes as well as information, resources and organisation. These constructs exist for the requirements, architectural design and detailed design life-cycle phases of GERA (called requirements, design and implementation in the CIMOSA modelling framework). In this section only the process modelling aspect of CIMOSA is discussed.

CIMOSA has a set of features to organise the process model into manageable modules. In CIMOSA, an enterprise is viewed as a collection of domains (see Figure 2.5). A domain is a functional area achieving some goals of the enterprise (e.g., sales, purchasing, R&D, production, etc.). A domain is composed of stand-alone processes, called domain processes, and interacts with other domains by the exchange of requests or objects. Each domain process is triggered by some event (solicited or unsolicited), and is composed of a chain of activities producing defined deliverables. Domain processes
ignore organisational boundaries; therefore, the scope of domains should not be confused with the scope of an organisational unit. A domain process can be further decomposed into business processes12, and eventually into enterprise activities. Relationships between activities are defined as behavioural rules.

12 According to this chapter's definition of 'business process', a domain process is a top-level business process.

On the level of detail that corresponds to the GERA preliminary design phase, a CIMOSA activity is the lowest level of process decomposition, such that each activity can be allocated to one resource capable of performing that activity. On the level of detail that corresponds to the GERA detailed design phase, a CIMOSA activity may be further described as a procedure (making CIMOSA on this level a workflow modelling language). This procedure consists of process logic and functional operations. A functional operation is a command or request understood by the resource that is to perform the activity. Thus, a detailed design level CIMOSA process model can be used for model-based control of business processes. Note that such a procedure is a generalisation of what is commonly understood as a computer program. A computer program is executable by a computer and its external devices, whereas a CIMOSA procedure may be executed by an automated resource (e.g. a software application, a tool, a robot, a communication device, etc.) or a human resource. Developers of CIMOSA procedures assume that all resources are interconnected by an integrating infrastructure that allows the execution of the procedure and the transparent delivery of messages between resources and CIMOSA procedures.

2.6. Workflow management
Whereas business process modelling on the requirements specification and design levels concentrates on understanding the workings of an organization in order to analyse and improve its processes, workflow management focuses on a highly automated implementation of processes using the organisation's software infrastructure. Workflow technology therefore not only includes an executable modelling language (a workflow language), but also the technology that enacts these process models.

2.6.1. Abstraction of process management
To understand the role of a Workflow Management System (WfMS) in an enterprise's ICT environment, it helps to review the role of management systems in general within software development. A strong analogy can be drawn between workflow and database management, in terms of management systems as the abstraction of a specific class of functions out of applications.

Historically, applications used to operate on data within main memory only. As these data were lost when the application exited, application programs were modified to use the file system as a data store. This involved data-structure-specific code within the application that manages the interaction with a file (reads or writes specific data). Other applications now had access to the same data due to the shared nature of file systems, which caused a number of data management problems. Any application changing the file structure, e.g. due to the need to add some extra information, affected not only the file itself, but also the data management code in all other applications, which needed to be aware of the file structure. Eventually it was realised that rather than repeating the same file access and data management functionality within every application, it would be more efficient to have a separate application, the database management system, that can be asked to manage the data and serve it to any application that needs access.

Similarly, many applications have business process logic embedded in them. Examples include code that passes control to the next application to be executed in a process, and code that specifies the order of execution through menus. WfMSs store a schema for such processes in the form of workflow templates, i.e. process types marked up with the extra information required for the system to
• identify the specific actors for whom to schedule tasks,
• invoke applications to enact the tasks of the process, and
• pass information between the different tasks of the process.

Workflow instances are created by the workflow enactment engine, which then also tracks the ongoing execution of tasks during the life of the process instance. It is to be noted that, because workflows are implementation-level process models, some tasks are carried out by application programs, while other tasks are carried out by some equipment or by humans; thus a complete workflow system should have suitable interfaces to any functional entity that understands requests directed to it. Furthermore, given that human tasks are involved, the model-based control implemented by workflow execution should not treat the human as a machine; thus workflow programmers must give consideration to the type of process logic that is suitable for such a heterogeneous execution environment.

2.6.2. Architecture
The basic architecture of a WfMS is well described by the original Workflow Management Coalition Reference Model (Hollingsworth, 1995). Although the division of interfaces has since been revised, the framework of describing a workflow system as a combination of six modules is nevertheless very useful. These modules are:

• Design tool
• Enactment engine
• Management and monitoring tools
• Work list handler
• Invoked applications
• Remote enactment engines

An issue in workflow management that is still under development is the provision of views over the execution status of each process instance, allowing for an increased level of security while permitting clients and low-authority staff to monitor the aspects of the process relevant to them. This helps increase staff awareness of corporate knowledge, and provides a valuable service to clients, e.g. as demonstrated by the packet
tracking mechanisms of FedEx implemented using workflow technology. Opening complete access to the management and monitoring layer of a WfMS may not be a wise idea if such a view mechanism is not available, given that the details of business processes are increasingly the focus of corporate differentiation; companies would not want these to be publicly viewable, e.g. by the organization's competitors.

The work list handler is the main interface to a workflow environment for people working in a process-driven organization. The application needed to enact a task within the process is invoked when the task is selected from the work list. Remote enactment engines are included in this architecture to support distributed workflow execution, such as in virtual enterprises or, in general, inter-enterprise processes. To provide more flexibility in the support of a business that requires both repetitive as well as ad-hoc/creative processes, integration between production WfMSs (such as IBM's MQ Workflow) and collaborative computing workflow tools (such as provided by Lotus Notes) is another aspect of the remote enactment service facility of workflow architectures. See (Lin et al., 1997) for a description of such integration.

2.6.3. Design principles and issues
The abstraction of process management out of the underlying applications leads to a basic design principle for workflow models, i.e. that the models be clearly abstracted and separate from the application logic. This design principle addresses the issue of granularity within workflow modelling: the granularity is driven by the applications that are implemented beneath the process management layer.

From a structural perspective, there are a number of flaws that can occur in process designs that such design principles will not help to avoid. The two key problems are deadlock and lack of synchronization. These have been addressed in workflow modelling through work such as (Sadiq et al., 2001).

2.6.4. Workflow from a data perspective
The added benefit of workflow management systems is that they can act as a platform for the integration of the disparate data sources (Muhlberger and Orlowska, 1996) that a business process involves, regardless of the technology used for the management of that data. A workflow may interact with a number of data sources, from heterogeneous database management systems, multidatabases and file systems to any other type of data source, through a communications layer and the applications that interact with those data sources. The communications layer itself can also involve bridging technologies, such as CORBA or DCOM layered on top of other communications protocols such as TCP/IP. Workflow thus becomes a type of distributed information management technology that can be used to manage data for interoperable systems. Production workflows in particular, on which this section is focused, involve the coordination of organisational information processing systems that are usually based on database management systems but can encompass other, non-DBMS architectures, having multi-tier, client-server architectures with a central workflow server responsible for the management of business processes. These workflows, in contrast to ad-hoc or administrative document
management workflows, have well-defined procedures and rigorous, multiple, repetitive process instance executions that may span several heterogeneous information systems.

Treating workflow as a generalisation of multidatabase13 transactions highlights the other issues that need to be addressed in workflow, and indeed in all process management. WfMSs are complex software products that should provide a number of functions/services to different groups of users. They should have extensive features to define the internal task structure, control the execution of activities involving different types of processing entities, and support reliable forward recovery services (for system failures) and backward recovery services (for external actions such as the cancellation of a customer request). In other words, workflow management should provide a form of atomicity, consistency, isolation and durability, similar to the classic ACID properties of database transactions. These requirements need to be relaxed, however, due to the long-running nature of workflow processes compared to database transactions. For example, for consistency, the underlying systems only need to be integrated for the data affected by the process instance, and then only along the flow of execution of that instance, rather than all information sources maintaining all dependencies across systems at all times.

2.6.5. Workflow modelling languages
Due to the implementation focus of workflow management, there are as many workflow language dialects as there are workflow enactment engines, design tools and workflow researchers. This is due in part to the independent development of workflow management systems, and also to the value that differentiation adds for every workflow vendor. From a process abstraction perspective, however, the options available to model a process are more limited. Common constructs, combined in a sketch after this list, are:

• Simple tasks, which describe an actual application or action performed by an actor in the system.
• Block (aggregate) tasks, which form a logical grouping of simple tasks.
• Sub-process tasks, which invoke a new process instance in place of a simple or block task.
• Control flow, connecting tasks in an asymmetric, directed order.
• Splits, which can fork the process path or describe exclusive or inclusive selection of paths. These are known as AND-Split, XOR-Split and OR-Split respectively.
• Joins, to recombine parallelism in the design of a workflow template. As for splits, these may be AND-Joins, XOR-Joins or OR-Joins.

13 Multidatabase management provides the logical integration of pre-existing databases through an integrated global schema and a transaction manager that generates global query plans issued to the participant database systems (transparently to the user), and without modification of the participant databases. In fact, users need not know that they are participating in a federated information architecture.
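The following minimal sketch shows how these constructs, plus the actor/application mark-up mentioned in Section 2.6.1, might be represented as a workflow template; all names are invented for illustration, and no vendor's actual schema is implied.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    # A simple task, marked up with the role that may execute it and
    # the application invoked to enact it.
    name: str
    role: str = ""
    application: str = ""

@dataclass
class Split:
    kind: str                  # "AND" (fork all), "XOR" (exactly one), "OR" (one or more)
    branches: list = field(default_factory=list)   # each branch: a list of nodes

@dataclass
class Join:
    kind: str                  # must match the corresponding split

# A workflow template as a node sequence; control flow here is simply the
# list order (real languages use directed graphs, omitted for brevity).
order_template = [
    Task("check order", role="clerk", application="order-entry"),
    Split("AND", branches=[
        [Task("pick goods", role="warehouse", application="wms")],
        [Task("prepare invoice", role="accounts", application="erp")],
    ]),
    Join("AND"),
    Task("ship", role="warehouse", application="wms"),
]

def task_names(nodes):
    # Walk the template, descending into split branches.
    for n in nodes:
        if isinstance(n, Task):
            yield n.name
        elif isinstance(n, Split):
            for branch in n.branches:
                yield from task_names(branch)

print(list(task_names(order_template)))
```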
The above constructs are commonly represented using a graphical notation. Extra information (not usually represented in the graphical notation) needs to be added to complete the workflow program: e.g., which application a task invokes when it is executed; which actor (or the role defining actors) can execute a task; and, in some modelling languages, even the data that are passed between tasks in a workflow process. Furthermore, the level of constraints that a workflow designer can specify also varies from implementation to implementation.

3. WHAT ISO 9000:2000 STANDARD REQUIREMENTS MUST BUSINESS PROCESS MODELS SATISFY?
The ISO 9000 family of standards has been developed to assist organisations in implementing and operating an effective quality management system (QMS). The ISO 9000 standards specify requirements14 for a QMS where an organisation needs to demonstrate its ability to provide products and services that fulfil customer and applicable regulatory requirements, to enhance the satisfaction of customers and other interested parties, and to improve the performance of the organisation (ISO/TC 176, 2000).

The ISO 9000:2000 requirements can be interpreted in the context of enterprise modelling: the standard may be understood as a policy/requirement-level enterprise reference model applicable to all types of enterprises15. Consequently, when enterprise models are developed (as may be captured in function & process, information, organisation and resource modelling views), they must satisfy the required ISO 9000:2000 quality properties. The ISO 9000:2000 standards define requirements for business process performance monitoring, identification of the organisation's strengths and weaknesses, assessment of the QMS's maturity level, continuous improvement, and complementing quality objectives with other objectives related to growth, funding, profitability, etc. ISO 9000:2000 therefore extends the traditional QMS towards a more general management system of the organisation.

The ISO 9000:2000 family of standards is based on eight quality management principles:

• customer focus
• leadership
• involvement of people
• use of a process approach
• use of a systems approach to management
• continual improvement
• factual approach to decision making, and
• mutually beneficial supplier relationships.

14 The ISO 9000:2000 series of standards consists of requirements (ISO 9001:2000) and guidelines (ISO 9000:2000 and ISO 9004:2000).
15 According to the GERA life-cycle phases.
Two of these principles (the process approach and the system approach to management) capture general requirements directly related to the identification, definition and description of business processes. The ISO 9000:2000 standards, being requirement specification and policy level standards, do not provide detailed, elaborated guidelines or reference models for how to fulfil these requirements (even though the ISO 9000:2000 and ISO 9004:2000 guidelines are a good start). Therefore, organisations are often left to their own devices in the selection of detailed design and implementation approaches to fulfil the standard's requirements.

The introduction of business process related requirements is new in ISO 9000:2000, and this is the first time that a wide-scale deployment of BPM is required from those who adopt a QMS. As a response, organisations not familiar with BPM often develop in-house business process modelling languages, methodologies and approaches. These languages are usually characterised by a weak definition of the modelling language's syntax and semantics, and consequently by low uniformity and unequivocality of process descriptions. A non-systematic approach to business process description can lead to limited (re)usability of the created business process models, which might not even satisfy the criteria to be called models in the strict sense of the word (see Section 2.2.2.1).

The aforementioned requirements of the new ISO 9000:2000 standards, and the recognition of obstacles to the implementation of BPM, have led the authors to develop general guidelines that can be followed to improve the efficiency, and support the practical adoption, of BPM in industry, with the aim of satisfying the business process related requirements of ISO 9000:2000.

3.1. Business process modelling related requirements of the ISO 9000:2000 standards
Of the eight quality management principles in ISO 9000:2000, two may be considered BPM-related principles (ISO/TC 176, 2000):
• the process approach, which ensures that the desired result is achieved more efficiently when activities and related resources are managed as part of a process (where the management of activities and related resources is not limited to, or constrained by, functional, divisional or unit borders), and
• the system approach to management, which requires the identification, understanding and management of interrelated and interacting processes as a system.

To implement a QMS, ISO 9000:2000 requires that the organisation must:

• Identify the processes needed for the QMS and ensure their application throughout the organisation;
• Determine the sequences and interactions of these processes;
• Determine the necessary criteria and methods which ensure that both the operation and the control of these processes are effective;
• Ensure the availability of the resources and information necessary to support the operation and monitoring of processes;
• Monitor, measure and analyse business processes.

In Sections 2.2.3.1 and 2.2.3.2, we described two types of process models (activity and behavioural), as well as three main process categories (structured, unstructured and ill-structured processes). The ISO 9000:2000 requirement for the determination of the 'sequence of processes' could unintentionally create the expectation and assumption that all processes are suitable for a uniform description/modelling (using a single process modelling language) and that they can always be described by behavioural process models. Unfortunately, this expectation is not uncommon in present-day practice. In addition to structured processes (e.g. accounting procedures, technological procedures, etc.), many unstructured (e.g. the new product development process) and ill-structured processes (e.g. innovative processes) exist in an organisation. Considering a) business process properties (and consequently their suitability for modelling) and b) the standard's requirements about the determination of process sequences, it can be concluded that:

• Interpreted in a strict way, the requirement of the ISO standard to determine the 'process sequences' cannot always be met;
• Interpreted in the spirit of the standard, 'sequences of processes' would better be understood as the modelling of the structure and relations of processes, where the structure is the composition of processes out of more elementary processes and activities, and the relations include information and material exchange (interfaces), and/or succession sequences and events, and/or relations in time. The decision about which of these relations to model depends on the process category, and thus appropriate process model types (and, accordingly, modelling languages) need to be used. Furthermore, the model(s) of business process structure and relations may have to capture additional characteristics that are essential for process design, prediction, analysis, planning, scheduling and control; and
• In general, the use of a combination of behavioural and activity process models (modelling languages; see Figure 3.1) is necessary to model business processes.

In addition to the description of operational processes (processes at the 'physical' level in the cybernetic model of artificial systems), a great deal of care must be taken in the identification and definition of management processes. Given that the majority of management processes are either unstructured or ill-structured, these process characteristics must be taken into account in the selection of an adequate model type. As a response to the particular nature of management processes, the GRAI laboratory has introduced the GRAI methodology for a management-oriented description of an enterprise. The GRAI methodology proposes to develop a high-level model of management processes using a GRAI-Grid. The graphical modelling language GRAI-Grid does
Figure 3.1: a) Activity process model, b) Behavioural process model.
not aim at the detailed modelling of management processes, but a) identifies decision roles (also called decisional centres) where decisions are made and communicated, and b) defines the decisional hierarchy through connections and interactions among these decisional centres (Doumeingts et al., 1998). The processes of decision centres may then be further detailed as management processes, and modelled using a functional or a behavioural modelling language.

As a first step in developing this high-level model of a management system, the decisional centres (see the individual squares of the grid in Figure 3.2a) are identified. As a second step, each decision centre's objectives, constraints, decisional variables, required inputs and delivered outputs may be added; together these are called the decisional frame of an individual decision centre. As a third step, the task to be performed to create a decision may be described (e.g. in natural language, as a list), which completes the high-level decisional model. A detailed model of the decision making processes can then be developed, depending on the nature of the process, using a functional (activity) model (KBSI, 2001a) or a behavioural model (KBSI, 2001b). However, often only a detailed natural language description is used. The detailed model may of course differ in granularity from decision centre to decision centre, depending 1) on the level of formalisability of the decision process in question, 2) on the intended skill levels of the human resources filling these management roles, and 3) on the formalisation needs of the links among decision centres.

Assume that a decision has been made to achieve some objectives by some time in the future (defined by a time horizon), and suitable activities are planned for this purpose. According to quality management principles, the execution and results of this plan need to be monitored. If performance indicators (the feedback from the physical system) show that there is a deviation (or likely deviation) from achieving the objective, adjustments must be made either to the decisions or to the objectives. Performance indicators must be developed and suited to the set of objectives at hand, and are part
..."" ""
.
..
.;..
. ,. . .
.
.
~
(Inlra-and Inter systemic)
Ou tputs
Decisi on Iramework for other DC ensu red through DC l
DC 1
Decisi on fra mework tor DC1 (OBJECTIVES, CONSTRAINTS AND DECISION VA RIABL ES)
Figure 3.2: al The GRAI-Grid concept, hi Co-ordination links between DCs.
real li me
h=lw p=ld
he f m p=lw
h=l y p=3m
ho rizon period
Information links between dec is ion cent res
decision framework
of the information links that flow among decision centres, the physical system, and the external environment. The structure of this information flow (especially if the information needs to be stored in a database) has to be modelled using a language suitable for data modelling (e.g. Entity-Relationship modelling, the IDEF1X modelling language, UML class diagrams, or similar).

A GRAI-Grid model is a road map of key decisions (decision-making processes), their interactions (shown as decisional frameworks and information links) and relations (shown as a hierarchy of decision centres). A decisional model is valuable for the design of an efficient and effective organisation. The GRAI-Grid can be used for the description of the functions of the management and control system at a high level (see Figure 3.3). Starting from this model, there is a choice to further model, at the detailed ('micro') level, the activities of decision centres. The GRAI methodology proposes the use of GRAI Nets, but other process modelling languages might also be used (IDEF0, CIMOSA, workflow modelling languages, etc.), whichever suits the given process category. The selection of an appropriate modelling language should be based on a) what the model will be used for, b) what tools are available to manage the models, and c) what languages best fit the people who will use the models.

To improve the efficiency of business process modelling, the organisation should adopt an enterprise integration methodology, and develop or deploy a business process (or rather enterprise) modelling workbench that supports a set of well-formalised and interrelated modelling languages and tools. Such a workbench should be populated with the reference models adopted by the enterprise, and with particular enterprise models: AS-IS models and various versions of TO-BE models (for a discussion of the variety of TO-BE models see (Hysom, 2003)).

3.2. Business process interactions
The ISO 9000:2000 standards refer to the process approach and process interactions as follows (ISO/TC 176, 2000): "For organisations to function effectively, they have to identify and manage numerous interrelated and interacting processes. Often, the output from one process will directly form the input to the next process. The systematic identification and management of the processes employed within an organisation and particularly the interaction between such processes is referred to as the process approach."

While in theory all business processes (their structure, the interaction of process activities, object flows, etc.) could be described in a very detailed and consistent way using functional (activity) or behavioural process models, the definition of process interactions does not necessarily require a fully detailed specification (description) of every process in question. The focus is on the definition of process interactions through the identification and description of the exchanged objects. The development of a complete and consistent (functional or behavioural) model of all interacting processes could be very difficult (or even infeasible) in practice. However, to fulfil the standard's demand to define process interactions, a mixture of high-level and detailed functional models may be satisfactory.
'"
......
."
-t
"erlf)' a nd , '. lida lt pr udu ct (a , pruj cct d t'lh l' nhll')
..\ ):r l:'("a nd dc chl e on produ ct r equ ir eme u tv \I ICd ficlJllnn
a nr lla h ili t ~
\I
purc n:t\('
:l
Sc he d ule project :\Ionifor projl·c.
.-
=
aclh'iril~\ -V
~
=
n=
n evetep det ail ed pruj ect pi a
<:= ::>
<::=
VI.' dr, a nd \alidah"llroj ec'Y, Commi tt ees]
( S lel' r i fl ~
n
()("\'d op ( / App rove (> project - . . \.I a !ot t' r m a st er pla n Illan \
n
u cvclo p '\ P p nl\ ,~"::{. project ----.. revo kep rn , p ropos al r» prop usal s
Technical mana gement decisio ns
n
Co ntr ol d tlh e red sen ices and ran' m lllcrl al \
:~~r~~~:~e o r
J)C\dOIJmake o r purch e w IJolicll" . CI Ual il) cu nt rul pla ns,
Anoca re a nn a OJus l re so urces Alloc al e b ud J:t l A", I~ 11 roles a nd n\ Jornihilil ie\
A 'li'i(,,\\ res ources and
C'lI pa h llil ll" d eterm in e pm ject rl' a\ ihili t ~
.-
.
.-
....-
\Vork alloca tion declsions
n
payments , Cc ntrul pr oject op rralinn\
Ap llr o\ e ord ers and
Ohrrlhurr \\ nrkfuad s Ul'n lop ur acq uire pr oj ect res o ur ce s. bal an ce bud ~cl
p.
I
budget. investment, rc..ourcc all(lC;\1ic19 plan ...
Pmpo..crc,"-mn:c dcv.plau •
Monitor
.-
To manage resources
F1nan cial lran1.Ul"1 ion record s Mln u'l'\ of rnc: ('lill ~' I·roj ecl ducurn entv
Int ern a 1eevouree Ina ila hilir)' A~~ rct.:a l ('d finan cial r ccord v \\'('l'k l)1I11onlhl) r c porh of prujecr te am
KCJ:ular chee k pu ln ts and mllcvtonc n 'p orh o f p rnj l'ch
Ih '(luircm r n b 21nal, \I \ report
prcj ccr Ent cr p rtve Refer en ce 'J udd \
Reco rd, uf dernnnvt rutrd eup ah llit les Rcsuurce :1\ :.IlIahlllr\' fr o m ERP ' )\1 4;'111
Financial and nou-flnanciul strategic perform ance indicators
I-
Product &. producuon history I(CS(lUrCC and production cost ..t'ltisli..:s
Internal info.
Figure 3 .3 : A GRA I-Grid model (high level model ) of the decision cent re of two enterpr ise ent ities: a pare nt ente rprise and a project , sh ow ing decision cen tres and their relation ship s.
Exter na l rc eouree
Dctcrmmc rompany sir-n eg)':
To plan
mi.,si~n. \'i~l . str ateg ic c:::: po Propose deployment and <= pOhJ\.'dl\C'S, policic... principles . ..... development strategy o f
--.
,'\'es,:ofia lf,' cl ient req ul rerncnt v Deckl e hls,:h lcvc l p rudu ct 'pecsifiral inn\
Propo sed sale ... market ing 11110 procurement plan,
propose product strat egy
Propose market ~1r.\lCgy al1,1
To manage products
PROJ ECT ENTE RPRISE
Cu sto me r eun flr m a tiun uf product
S u p plier. vcrvlce prm. id e d l'Uil s
C lll' ni re q ues ts C uvto mcr da t.. ' M arker imes lis,:allun da la
...
Mallet unal yxi.. n.'Jl0rb -I Benchmarking report s Best pract ice information sou rces Fundamental re search report ..
External info.
PAR ENT ENTE RPRISE
316
Figure 3.4: The process interaction matrix.
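The figure itself is only partially recoverable in this copy; as a stand-in, the sketch below renders a small interaction matrix in code, following the conventions described next (the process and object names are invented for illustration):

```python
# A process interaction matrix: processes on the diagonal; the cell in the
# column of a producer process and the row of a consumer process holds the
# exchanged object (output of the producer, input of the consumer).
processes = ["P1 order entry", "P2 invoicing", "P3 shipping", "ENV environment"]

# (producer, consumer, exchanged object) triples; ENV stands for external processes.
interactions = [
    ("P1 order entry", "P3 shipping", "X1: confirmed order"),
    ("P2 invoicing", "P3 shipping", "X2: delivery note"),
    ("P3 shipping", "ENV environment", "X3: shipment"),
]

n = len(processes)
index = {p: i for i, p in enumerate(processes)}
matrix = [["" for _ in range(n)] for _ in range(n)]
for i, p in enumerate(processes):
    matrix[i][i] = p                                   # diagonal: the processes
for producer, consumer, obj in interactions:
    matrix[index[consumer]][index[producer]] = obj     # row = consumer, column = producer

for row in matrix:
    print(" | ".join(cell.ljust(22) for cell in row))
```

As the closing remark of this section notes, exactly such a matrix can be maintained in a plain text editor or spreadsheet, which is one of its practical attractions.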
Furthermore, functional modelling languages often demand a detailed definition and specification of the modelled processes. The resulting system of interacting processes can therefore appear very complex, without emphasising the description of the process interactions themselves. To emphasise process interactions, the authors propose the application of a simple process interaction matrix (see Figure 3.4). A process interaction matrix is defined by basic syntactic and semantic rules:

• Individual business processes (internal processes as well as interacting external ones) are represented as boxes on the matrix's diagonal (P1 to Pn) and are numbered from the top left to the lower right corner (the numbering does not imply a sequence of execution or timing);
• Any process may use certain inputs and create outputs. The inputs of an individual process are represented as boxes situated in the same row as the process box (left and right of the process box); the outputs of an individual process are represented as boxes situated in the same column as the process box. The box at the intersection of P1's column and P2's row is an output of P1 used by P2 (the output of process P1 as an input); the matrix thus represents the interaction of P1 and P2. Figure 3.4 shows the interaction of processes P1 and P2 with process P3, where X1 represents the interaction between P1 and P3 (the output of process P1 and input of process P3), while X2 represents the interaction between P2 and P3 (the output of process P2 and input of process P3). Figure 3.4 also represents the interaction between process P3 and an external process (part of the external environment): X3 is the object exchanged in this interaction (the output of process P3 is the input of an external process);
• Process interactions (or, more precisely, interacting objects) may be a) transformation objects (material or information), or b) control objects (information entities, like laws, policies, standards, etc.) guiding or constraining a process. As a convention, the
names of material objects are written in normal text style, information objects in bold and control objects in italic style;
• If necessary, each process box (e.g. P1) may be represented in more detail in a separate matrix.

The process interaction matrix is similar in expressive power to the IDEF0 modelling language, with the exception that resource (mechanism) inputs are not distinguished from ordinary or control inputs, and arrow bundling/branching is not supported within one matrix (however, the decomposition of interface objects can be done in a detail-matrix). On the other hand, the minor addition of a graphical notation to distinguish material, information and control objects has been found useful in practice. The advantages of this matrix variant of IDEF0 are its efficient use of space on a page, and the possibility to construct it using a simple text editor or spreadsheet program. Note that the GRAI-Grid presented in Section 3.1 is also a kind of process interaction matrix, because it defines the interactions between decisional centres, or decisional processes respectively (processes on the management and control level).

3.3. Product realisation and support processes
The ISO 9000:2000 standards require the identification of "the organisation's product realisation processes, as these are directly related to the success of the organisation. Top management should also identify those support processes that affect either the effectiveness and efficiency of the realisation processes or the needs and expectations of interested parties" (ISO/TC 176, 2000).

Many organisations encounter difficulties in the attempt to identify, and differentiate between, product realisation and support processes. To draw a line between these two groups of processes, some strategic management concepts may be used. Strategic management emphasises the importance of the identification, development, accumulation and maintenance of the organisation's core capabilities and competencies, in order to maintain a long-term source of organisational competitiveness and competitive advantage, and consequently a successful market position for the organisation.

The ISO standard's definition of product realisation processes could lead to a 'narrow' understanding of their meaning. Product realisation processes are usually associated with development, manufacturing, sales or distribution processes. However, the identification of a company's core capabilities and competencies (and their associated processes) may reveal a larger set that includes both traditional product realisation processes and other processes of key importance. After all, a core product or end-products are only the material manifestations of an organisation's capabilities.

To understand the relationships between product realisation processes and an organisation's core competencies, some basic notions, such as capabilities and competencies, should be defined first. According to ISO 9000:2000, a capability is the ability of an organisation, system or process to realise a product that will fulfil the requirements for that product. Generalising this definition, a capability can be defined as a firm's ability to execute
business processes and activities to produce and deliver a required product through the deployment of the firm's resources. A capability is therefore a permanent or temporary aggregation of non-specific and/or specific assets needed to execute certain business processes (Kalpic et al., 2003). Capabilities may be functional (e.g. the ability to develop new products) or cross-functional (e.g. quality or integrative capabilities, such as the ability to manage a network organisation).

Capabilities that directly contribute to, and improve, the value perceived by the market/customers are called the core competencies of the organisation (Prahalad and Hamel, 1990). A core competence is a company-specific capability (a capability of strategic importance) which makes the company distinct from its competitors and defines the essence of the company's business. Firm-specific (core) capabilities may also be considered from the perspective of the firm's competitive advantage: governance over core capabilities should result in competitive advantage for the firm.

Companies may possess many competencies, some of them core and some non-core. Irrespective of the strategic importance of core competencies, organisations are often not clear about what is and what is not a core competence. It is essential to be able to make this distinction, because any neglected core capability may result in the loss or weakening of the company's competitive position. The first criterion is whether the activities that are part of the competence really contribute to long-term corporate prosperity. The second criterion is that a core competence must 'pass' the tests and meet the criteria below (Hamel and Prahalad, 1994):
• customer value: a core competence must make a disproportionate contribution to customer-perceived value;
• competitor differentiation: the capability must be competitively unique;
• extendibility: a core competence is not merely the ability to produce the current product configuration (however excellent that product line may be); it must also be usable as the basis of potential new products.

For example, according to these criteria, a very efficient home-grown accounting system, supported by company-specific software, cannot be considered a core competence of a manufacturing company. Even though it is a specific asset of the organisation, it does not directly contribute to the value of the products or services perceived by the customer; therefore these accounting capabilities cannot be considered a core competence.

In conclusion: the identification of an organisation's core competencies and associated processes is equally as important as the identification and definition of the (traditionally perceived) product realisation processes. Therefore, it is the core competencies and all associated (support) processes that need to be identified, defined and managed through their entire life-cycle. According to Hamel and Prahalad (1994), core competencies are the soul of the company and, as such, their management must be an integral part of the management process of company executives.
3.4. From business process modelling to enterprise modelling
The ISO 9000 family of standards, in addition to business process modelling, requires the identification, definition and description of other enterprise entities as well. The standard's requirements for the definition of authorities and responsibilities, of required and possessed categories of individual capabilities, and of process key performance indicators are an extension that leads from business process modelling to enterprise modelling. For a systematic categorisation of modelling-related standard requirements, the GERA entity model content views can be used. GERA defines four different model content views for user-oriented process representation (IFIP-IFAC, 2003):
• The Function View (presented in detail in the discussion of functional, behavioural and decisional types of process models).
• The Information View collects the knowledge about objects of the enterprise (material and information) as they are used and produced in the course of the enterprise's operations. The information to be modelled is identified from the relevant activities and is structured into an enterprise information model in the information view, for information management and for the control of the material and information flow.
• The Resource View represents the resources (humans and technical agents, as well as technological components) of the enterprise as they are used in the course of the enterprise's operations. Resources are assigned to activities according to their capabilities and structured into resource models.
• The Organisation View represents the responsibilities and authorities of all entities identified in the other views (processes, information, resources). This view also represents the structure of the enterprise's organisation, by aggregating the identified organisational units into larger units such as departments, divisions, sections, etc.

3.4.1. Organisational view related standard requirements
With the emergence of decentralised organisations, flat hierarchies, etc., explicit knowledge about the roles of individuals, and about who is responsible for what, is indispensable for any enterprise, especially for those operating according to new management paradigms (Vernadat, 1996). Typically, humans may assume different roles, for example: chief executive, marketing & sales, technical (R&D), finance, production planning, logistics, information system designer, quality inspector, etc. Alternative organisational structures may also be deployed; for example, elements of an organisation may be linked hierarchically or heterarchically and demonstrate the properties of holons, webs, nets, temples or clusters. Further organisational structuring may occur on a functional, process or geographic basis. Individuals and groups of individuals will be assigned a number of roles and responsibilities. These assignments need to be carried out concurrently and cohesively, where each may involve different reporting lines and control procedures. It is important to understand when, by whom and how decisions are made in the enterprise, as well as who can fulfil certain tasks or replace others. The requirement
Figure 3.5: Authority matrix (rows: organisational entities such as the Board, Marketing, Economics, Finances, Legal affairs and SBU roles including the supply chain, R&T, production and quality assurance managers; columns: decisions such as contract review by contract value, strategic contracts, customer complaints and unsettled debts). Legend: Proposal (PR), Review (R), Approval (A), Execution (E), Control (C), Assessment (AS), Request (RQ).
to define and communicate responsibilities and authorities within the organisation is also included in the ISO 9000:2000 standards.

Responsibilities and authorities cannot be defined completely and consistently before the processes are described and the decisional system is designed, because responsibilities and authorities constitute only one view of enterprise processes. The systematic use of activity, behavioural and decisional process modelling provides a description of the processes and activities of the physical (customer service & product) and management and control levels, as well as of the relations between these, and through this, the explicit allocation of responsibilities and authorities. The description of responsibilities and authorities can be achieved by employing different matrices. An organisational matrix usually puts on one axis the decisions or tasks to be carried out, and on the other the relevant organisational entities. Figure 3.5 shows a simple example of an authority matrix; acronyms are used for shorthand (PR: proposal, R: review, A: approval, etc.).

The organisational view may be represented using traditional organisation charts (a tree), at least to describe the organisational hierarchy. However, if the organisational
chart is represented as a matrix (see Figure 3.5), it assigns management (decision) tasks to organisational units and may thus be considered a kind of (simplified) functional model of an enterprise. The organisational chart, as the most visible end-result of organisational design, structures divisions into departments, departments into sections, and so on, and allocates a manager to each of these.

3.4.2. Resource view related standard requirements
In addition to the explicit definition of the roles of humans in the enterprise (the definition of the organisation), the required capabilities (for any single position) and the possessed capabilities (of any individual) have to be known as well. The ISO 9000:2000 standard requires that personnel be competent on the basis of appropriate education, training, skills, experience, talents or backgrounds, and intellectual, psychological or physical capabilities. A competence is a demonstrated capability to apply knowledge or skills, accomplishing the task and delivering appropriate results. Therefore, the organisation must:

• Determine the necessary competence levels for personnel performing work that directly or indirectly affects product quality. Note that this requirement is equally applicable to personnel involved in production & service delivery and in management & control;
• Allocate personnel to jobs, matching the competencies of the individual with the capabilities required by the job; and
• Provide training or take other action to satisfy these needs (foster the continuous development of employee competencies to the required level).

To identify, define, and actively manage the capabilities and competencies of personnel, the organisation can define the main categories of capabilities and describe them by relevant attributes. The standard also requires that the organisation manage professional promotions for its employees. Therefore, organisations have to design professional development plans (career planning) for each individual and actively execute and manage those plans. Career planning and the execution of promotion plans cannot be efficient if the enterprise has no processes to achieve this, e.g. through the division of tasks and jobs in a way that promotes gradual professional development.

Figure 3.6 shows an example of an R&D department's professional development map, describing the gradual progress of individuals based on their demonstrated capabilities. An employee in the R&D department may start his or her career at the assistant level, to acquire the basic skills and concepts of product development. The next typical transition would be to the position of constructor and developer. At that stage the basic professional training and development is complete, and the professional
Figure 3.6: Professional development: attainment of human resource competencies (Abel, 2003). (The map plots positions, from researcher to other (senior) management positions, against professional scope and managerial responsibility, each ranging from low to high.)
career starts branching: the individual may continue to become a) a highly focused and competent professional (researcher/technical expert), b) a manager (project manager), or c) a marketing manager (product manager). An R&D department may also be considered the main 'recruiting' department, a source of professionals for various other departments, such as sales and marketing, or for various (senior) management positions (e.g. new plant manager). In the definition of the typical positions and professional migration steps (transitions) shown in Figure 3.6, one should aim at a balance between the scope of the given managerial and professional role and the required managerial, leadership and professional capabilities/competencies.

Both the enterprise engineering process and the operational environment rely on a significant amount of technology. Technology is either production oriented, and therefore involved in producing the enterprise's products and customer services, or management and control oriented, providing the necessary means for communication, information processing and information sharing. Therefore, besides modelling human capabilities, at least two other fundamental types of functional entities (resources) have to be modelled: a) devices or machines (including IT, manufacturing or other types of technological devices) and b) applications (i.e. software packages). Both types of functional entities can be defined in terms of their technical characteristics and constraints (e.g. data access time for a database server, maximum feed rate of a machine, number of units processed per unit of time, etc.), the types of functional
operations offered, the level of machine autonomy, etc. A resource may be described using the following example set of characteristics (Vernadat, 1996):

• identification
• type
• nature (consumable or non-consumable)
• capacity
• availability
• roles
• functionality
• locations
• shareability
• mobility
• reliability estimates
• cost per unit, etc.

Resources may also be grouped into classes according to their nature. Generic classes can be defined listing essential resource characteristics. Subclasses of these are more detailed; they inherit the characteristics of the generic classes and add further specific ones. Capability models would have to be developed for aggregate resources according to the intended/existing resource structure (e.g. shop floor models, system architectures, information models, infrastructure models), communication models (e.g. network models), etc. Despite the importance of resource management, few modelling methods applicable to enterprise modelling offer resource constructs. Only the CIMOSA resource modelling view (CIMOSA Association, 1996) and Integrated Enterprise Modelling (Spur et al., 1993) provide means to model in detail the structure of resources and their related characteristics.

3.4.3. Information view related standard requirements
In Section 2.1 a simple model of an artificial system was discussed. This model defines the following functions for the Management Information System (MIS):

• connecting the physical system and the management and control system (decision system);
• exchanging information with the external environment;
• delivering feedback; and
• aggregating information suitable for decision support.

The requirement for the use of an MIS is also expressed in the ISO 9000:2000 standard: "the organisation shall apply suitable methods for monitoring and, where applicable, measurement of the management system processes. These methods shall demonstrate the ability of the processes to achieve planned results".
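A minimal sketch of the feedback and aggregation functions just listed may help; the measurements, the indicator name and the target value are invented for illustration and do not come from the standard.

```python
# Toy MIS: collects raw measurements from the 'physical system' and
# aggregates them into an indicator delivered to the decision system.
raw_feedback = [
    {"process": "shipping", "on_time": True},
    {"process": "shipping", "on_time": False},
    {"process": "shipping", "on_time": True},
]

def aggregate(feedback):
    # Aggregation: condense raw events into an indicator for decision support.
    total = len(feedback)
    on_time = sum(1 for f in feedback if f["on_time"])
    return {"on_time_delivery_rate": on_time / total if total else None}

def decision_support(indicators, target=0.95):
    # Feedback function: flag deviations from the objective so the decision
    # system can adjust either the decisions or the objectives.
    rate = indicators["on_time_delivery_rate"]
    return "deviation: adjust plan" if rate is not None and rate < target else "on track"

indicators = aggregate(raw_feedback)
print(indicators, "->", decision_support(indicators))
```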
In the definition or design of key performance indicators (KPIs), organisations are often too focused on the development of a set of financial indicators, which show the growth of revenue, profit rate, market share, etc. Traditional financial indicators reflect the results of the company's previous activities and efforts; they may be called 'lagging' indicators. Non-financial indicators are useful for monitoring the structural development of the company, the organisation's potential and its corporate health; these may be called 'leading' indicators. Therefore, in addition to financial indicators, a set of non-financial indicators has to be developed and used.

The importance of using both types of performance indicators is also recognised by the ISO 9000:2000 standards. The standards require that, in addition to financial indicators, the organisation should also measure process performance and other success factors identified by management, such as the satisfaction of customers, of people in the organisation and of other interested parties. There are a number of methods and techniques to achieve this, e.g. benchmarking, process assessments, etc.

The definition and design of KPIs can be supported by one of the contemporary methodologies developed for this purpose, such as the EFQM model or the Balanced Scorecard (BSC) methodology. Kaplan and Norton (1996) in their BSC methodology identify four categories of performance indicators (learning and growth, business processes, customer relationship and financial performance). The EFQM Excellence Model (1999) organises 32 subcriteria into 9 major criterion groups. The BSC also allows the creation of a tangible link between the organisation's vision and its translation into strategic objectives, key success factors, key projects and performance indicators.

To provide management with control over these key performance indicators, the support of an MIS is essential. A major part of the MIS is a software application that facilitates data collection (from a wide range of internal and external data sources), its presentation (a highly automated generation of KPI values from the relevant databases), as well as the interpretation and distribution of the relevant KPIs to individuals. In the design of software applications for the MIS, an adequate methodology should be used. The various Data Warehouse16 (DW) methodologies available today can be applied in the analysis, design and implementation of an information system that enables data to be transformed into meaningful business information, overcoming today's companies' richness of data (availability of different applications and systems such as ERP, other transaction-oriented systems, CRM, e-business, etc.) and poverty of information. A DW methodology is composed of four major phases:
• Organisational analysis, delivering information system requirements (business and technical), as well as the identification of business objectives and a technical feasibility analysis.

16 The data warehouse is the central repository for data consolidation, cleansing, transformation and storage in a format best organised for reporting, extraction and data mining. Virtually all data warehouse best practice methodologies embrace variants of what is often referred to as a "hub and spoke" architecture. Data warehouses function as the hub, or staging area, for feeding application-specific data marts.
• Design phase, composed of the following activities: conceptual design delivering a conceptual data model, logical design and physical database design from the data model, detailed specification of the process model for extraction, transformation, and loading, design of reports, and the design of additional aspects such as the security and metadata models. For modelling the information aspect, many different modelling languages could be used, such as the (extended) entity-relationship model, IDEF1X, EXPRESS, UML class diagrams, SQL or the CIMOSA information modelling language.
• Construction phase, where implementation teams build the database management system, code and populate the warehouse with data, and develop the applications for end-user analysis and reporting.
• Verification and validation of the system (e.g. data quality validation, user acceptance testing, regression/system load testing, etc.).
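As a minimal illustration of the automated KPI generation mentioned above (a sketch only; the table, column and KPI definitions are hypothetical, and sqlite3 merely stands in for the warehouse DBMS):

```python
import sqlite3

# Stand-in for the warehouse DBMS; in practice this would be a connection
# to the organisation's data warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_fact (order_id INTEGER, region TEXT,
                             revenue REAL, delivered_on_time INTEGER);
    INSERT INTO sales_fact VALUES
        (1, 'EU', 1200.0, 1), (2, 'EU', 800.0, 0),
        (3, 'US', 1500.0, 1), (4, 'US', 700.0, 1);
""")

# Two example KPIs: revenue per region (financial, 'lagging') and
# on-time delivery rate (non-financial, 'leading').
for region, revenue, otd in conn.execute("""
        SELECT region, SUM(revenue),
               AVG(delivered_on_time)   -- share of on-time orders
        FROM sales_fact GROUP BY region"""):
    print(f"{region}: revenue={revenue:.0f}, on-time rate={otd:.0%}")
```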
3.5. The ISO 9000:2000 and business process reference models

The ISO 9000 standard requires that the organisation shall plan, develop and control the processes needed for product realisation. Many product realisation processes are fairly standard, structured and repetitive in nature, and can be described by behavioural or activity process models (e.g. sales processes, order confirmation, shipment processes, the customer complaint resolution process, etc.). These processes are usually well defined, described and formalised in so-called 'quality procedures' (QPs) in the organisation's QMS. However, enterprises could incorporate in their repertoire of models other product realisation processes, even if they are unstructured or ill-structured (as, for example, the product design and development process). QPs for these are usually not described, or if they are, not in sufficient detail to provide guidelines or procedures for their execution. For such processes QPs would typically exist only as high-level requirements for review, verification, validation, monitoring, inspection, and test activities, and activities attached to the determination of quality objectives. Some of these processes (e.g. product design and development) are performed and managed in a fashion similar to project management. To improve quality, reliability and efficiency, as well as to provide support for project design (planning and scheduling), implementation and operation (execution and control), organisations should develop reference models for these processes. Project reference models (including the processes performed in the project) do not necessarily have to standardise on a single particular process; rather they should propose a model, based on which each individual project may design its own, tailored process. The existence (and adoption) of such a reference model promotes commonality without forcing uniformity on a process that by its nature is expected to be different each time it is executed. For example, a reference model for product design and development processes may determine:
• the design and development stages, and the key activities in each stage,
• the review, verification and validation processes that are appropriate to each design and development stage,
• the responsibilities and authorities for design and development tasks,
• the inputs and the outputs of the design and development process.
An activity model may be able to capture the commonality in every such process, whereas the actual procedure (sequence of activities) and the life-history (development in time) of every single development project is potentially unique. Behavioural reference models can be developed for projects that have a repetitive nature, i.e., where product development follows a predominantly predictable path. In the authors' experience reusable process reference models benefit the company (see the sketch after this list) through:
• supporting project planning, and the scheduling of activities and resources, etc., so that development activities do not start from scratch;
• improving the efficiency and quality of project planning and scheduling;
• providing a repository of knowledge and experience for the project planning phase (through the formalisation and reuse of this knowledge);
• providing the user/project manager with a checklist of important activities during the project's development (bill of activities);
• helping to create a common communication platform, and providing for the entire organisation a greater chance of understanding what is represented in the project plan.
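The following minimal sketch (illustrative only; the stage and activity names are hypothetical) shows one way a process reference model could be represented as a bill of activities and tailored into a project-specific process:

```python
from copy import deepcopy

# A reference model as a 'bill of activities': stages mapped to key activities.
REFERENCE_MODEL = {
    "concept":       ["capture requirements", "feasibility study", "design review"],
    "detail design": ["CAD modelling", "design verification", "design review"],
    "validation":    ["prototype build", "product validation tests"],
}

def tailor(reference: dict[str, list[str]], *, skip: frozenset[str] = frozenset(),
           extra: dict[str, list[str]] | None = None) -> dict[str, list[str]]:
    """Derive a project-specific process from the reference model."""
    plan = deepcopy(reference)
    for stage in skip:                          # drop stages not relevant here
        plan.pop(stage, None)
    for stage, acts in (extra or {}).items():   # add project-specific activities
        plan.setdefault(stage, []).extend(acts)
    return plan

# A variant project: no physical prototype, but an extra regulatory check.
project_plan = tailor(REFERENCE_MODEL, skip=frozenset({"validation"}),
                      extra={"detail design": ["regulatory compliance check"]})
print(project_plan)
```

Each project thus starts from the common bill of activities yet ends with its own, tailored process, which is exactly the commonality-without-uniformity property described above.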
4. BUSINESS PROCESS MODELLING IN BUSINESS PROCESS REENGINEERING

In many companies operations are still performed in a very traditional way and information is gathered using 'paper and pencil' methods. This causes low responsiveness to customers' demands, long process lead times, delays, information losses, excessive overhead and unnecessary production expenditures, and consequently low customer satisfaction (Vernadat, 1996). New business trends have forced many companies to review and simplify their operating procedures (Hammer and Champy, 1993). The Business Process Reengineering (BPR) movement promotes the fundamental rethinking and radical redesign of business processes (within and between organizations) to bring dramatic improvements in critical, contemporary measures of performance, such as cost, quality, service, and speed (Hammer and Champy, 1993; Davenport and Short, 1990). Teng et al. (1994) argue that the increased attention to business processes is largely due to the Total Quality Management (TQM) movement. They conclude that TQM and BPR share a cross-functional orientation. However, Davenport (1993) observes that quality specialists tend to focus on incremental change and gradual improvement of processes, while proponents of reengineering seek radical redesign and drastic process improvement. The popularity of BPR notwithstanding, there are many misconceptions about the essence of reengineering. Hammer and Champy (1993) note that what organisations often do under the banner of reengineering is simply a reorganisation project, a staff reduction exercise, or an incremental efficiency programme. The essence of reengineering is
not 'reorganizing' or 'downsizing'. Reengineering looks at what work is required to be done, eliminating work that is not necessary, and finding better, more effective ways of doing what is needed; its focus is not on how the organization is structured. If a company embarks on a reengineering programme, then organizational structures should be defined only after the production and service delivery processes have been (re)designed. In other words, the organizational structure is designed so it can best support these processes. Therefore, reengineering is not simply about making an organization more efficient, but about creating value for the customer. Note that an indiscriminate customer focus can also be a danger of reengineering, because the purpose is not exclusively service to the customer; equal value should be placed on the continued health and competitiveness of the organisation (so the company can continue serving customers in the future).

4.1. Ten-step approach to BPR
Many different BPR methodologies have been proposed in the literature. The ten-step approach presented below integrates the guidelines of some major methodologies proposed by a range of authors (Vernadat, 1996; Davenport and Short, 1990; Malhotra, 1998).
1. Identify processes and set objectives for improvement. First, one has to identify those business processes that are targeted for improvement. Priority for BPR is given to those processes that directly contribute to the organisation's mission and vision. Therefore, the strategic identity of the organisation should be profiled and a strategy must be defined. BPR must be driven by a business strategy, which implies specific business objectives relevant for the BPR process (such as cost reduction, time reduction, output quality improvement, etc.).
2. Get management commitment to re-design processes. Managers are accountable for results and are therefore empowered to act with much discretion with respect to business process reengineering. Leadership is critical to the success of any BPR effort.
3. Form a cross-functional team. Process redesign usually has significant impacts across organisational boundaries and generally has impacts or effects on external suppliers and customers. For this reason, the process reengineering team must be cross-functional, including members from all impacted organisations or organisational units.
4. Model the AS-IS process. To develop an AS-IS process model, basic information about the process in question has to be acquired first. Formal documents (e.g. documents of the company's quality management system) can be used as the first source of information. (See Section 4.2 for common problems in gathering process information, as well as for a discussion of cases where the preparation of an AS-IS model is not expected to add value to the BPR exercise.)
5. Identify areas for improvement. Business processes need to be analysed to eliminate non-value-added activities, simplify and streamline limited-value-added processes, and examine all processes to identify more effective and efficient alternatives to the process, data, and system baselines.
6. Design the ideal TO-BE process. Determine how much of the TO-BE process can actually be implemented starting from the AS-IS process, followed by the evaluation of alternatives (e.g. through a preliminary functional/economic analysis) and the selection of a preferred course of action. In the design of the TO-BE process: a) awareness of ICT capabilities should influence the process design, and b) a review of the existing, or design of new, process key performance indicators must be carried out. The new process design must clearly show how the company and its customers will benefit from its implementation; the mere fact that the new business process is correct (i.e., it will produce the expected deliverables) is not a sufficient argument for its adoption.
7. Verify the TO-BE process. The TO-BE process should be verified (showing that the model is based on the company's business requirements), tested for correctness (verifying that the process would work if implemented, using business process simulation, an ABC tool, acting-out sessions conducted by stakeholders, etc.; a minimal simulation sketch is given after this list), and improved (if needed). Note that this verification process is not the same as validation after process implementation (see step 9), because the execution of the process under real-life circumstances may bring out further needs for correction or improvement.
8. Propose an implementation plan and get management commitment. The implementation plan, besides the definition of the main process implementation activities (including timing and responsibilities), should define any required organisational changes, the required resources and capabilities, etc. The implementation plan should recognise that the first implementation is likely to reveal needs for modification; therefore a testing period should be planned for.
9. Install and validate the new process. Installation of the new process should be done according to the implementation plan and managed as a project. Ultimately, process validation is carried out by the process owner, based on the results of key performance indicators.
10. Monitor the new process for future changes as needed. The actual design should not be viewed as the end of the BPR process. Rather, it should be viewed as a prototype, with successive iterations performed in an ongoing process.
In spite of well documented BPR methodologies and management awareness and initial commitment, statistics show that about 70% of BPR projects still fail. According to the literature (Malhotra, 1998; Bashein et al., 1994) some of the biggest obstacles faced by BPR projects are:
• lack of sustained management commitment and leadership,
• unrealistic scope and expectations,
• resistance to change,
• a "Do It to Me" attitude (lack of active involvement),
• unsound financial conditions,
• too many projects under way, etc.
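As a minimal illustration of the business process simulation mentioned in step 7 (a sketch only; the two-activity process, its triangular duration estimates and the rework probability are all hypothetical assumptions):

```python
# Monte Carlo estimate of process lead time for a hypothetical two-step
# TO-BE process; all figures are illustrative assumptions.
import random

def simulate_order_process(n_runs: int = 10_000) -> float:
    lead_times = []
    for _ in range(n_runs):
        order_entry = random.triangular(0.5, 2.0, 1.0)   # hours (low, high, mode)
        credit_check = random.triangular(0.2, 4.0, 0.5)  # hours
        # assume a 10% chance the check fails and must be repeated
        while random.random() < 0.10:
            credit_check += random.triangular(0.2, 4.0, 0.5)
        lead_times.append(order_entry + credit_check)
    return sum(lead_times) / len(lead_times)

print(f"mean lead time: {simulate_order_process():.2f} h")
```

Varying the durations or the rework probability shows how candidate TO-BE designs compare before any of them is implemented.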
4.2. How to develop an AS-IS process model

In the creation of the AS-IS process model (both for BPR and for documentation in a QMS), companies face some typical problems.
A QMS (quality manual and quality procedures) is a main source of documented descriptions of the organisation's business processes. These descriptions are usually composed of text and simple charts. However, everyday practice has shown the authors that, based on QMS documents, it is very difficult (or impossible) to reconstruct the contents of a process or to fully understand its functions, the sequence of activities, their dependencies, required inputs and delivered outputs, or to identify the key decisions or the allocation of authority over these decisions. The reason for this is that the use of textual descriptions and simple charts does not guarantee an understandable, transparent and unequivocal description of business processes. Therefore, the use of business process modelling supported by formal modelling languages, tools and methodologies is needed to provide a systematic, standard, unequivocal, interpretable and complete description of information about processes, the involved entities, functionality and behaviour. Such formal models play an essential role in the quality description of business processes. The incomplete process information captured in usual QMS documents, or in other organisation-specific documents, has to be extended through additional interpretation, which usually requires interviews with stakeholders and the process owner(s). During the development of AS-IS models, the authors have experienced how difficult it is for people to express their implicit knowledge of the process. Capturing information about the process is therefore a difficult and time-consuming task. At the same time, process models are an adequate base and communication vehicle (between the process owner and the person who performed the modelling) for the exchange, presentation and agreement on the interpretation of facts about the process (Kalpic and Bernus, 2002). The authors, based on their own industrial experience gained through running different BPR projects and the creation of AS-IS process models, point to the importance of a basic understanding of the modelling language syntax and semantics by all stakeholders, to achieve an efficient exchange of the information about the process captured in the model. This may seem an obvious statement, but since enterprise models are mostly graphical, they can be interpreted by untrained stakeholders as just illustrative 'figures' or 'pictures' and, as a consequence, part of the information in these graphically represented models may not be conveyed (and this fact may remain unnoticed). Therefore, a short introduction to the syntax and semantics of the modelling language used in process modelling should be given first. The design of the process model is usually an iterative process where, based on the model, the exchange of information and understanding of its content is to be achieved between the process owner and the process designer/analyst. The process designer/analyst and the process owner iterate this modelling process until the process model is mutually agreed on and confirmed as a credible and relevant snapshot of the existing process. There are arguments for not preparing an AS-IS model, in case the present process is known to be of such inferior quality that little would be learnt from a model of how it is performed at present. However, often this knowledge is not shared by stakeholders, and they need an understanding of what the problems are. In such a case an AS-IS modelling
exercise may be started, with the intention that all stakeholders agree on the lacking qualities of the existing process. Practice shows that AS-IS modelling in such cases is usually not carried to the end (not to the level of detail that one might expect from a TO-BE model). This is because stakeholders will automatically include corrections in the model, and what started out as an AS-IS model becomes the implementation of steps 5 and 6: ideas for improvement and the preparation of a TO-BE model. If management is aware of the likelihood of this happening, starting an AS-IS modelling exercise may still be of use for training purposes, e.g. if participants are not familiar with the formalisms intended to be used for modelling. However, this latter goal can also be achieved through training that is based on existing good examples.

4.3. Use documented best practice as an input to the BPR process
To improve the quality of TO-BE process models, and the effectiveness and efficiency of the design process, BPR practitioners might use available business process reference models. Reference models can be used as a) general and high-level guidelines, developed from examples of best practice in the given industry (e.g. to gain insight into the most critical points of the process), or b) requirements which must be met (e.g. in case the business processes must be adjusted to an ERP system implementation). Business process reference models may be activity or behavioural models, depending on the type of the process and the purpose of the reference model's use. In the development of business process reference models, and in BPR, the authors have noticed that in many cases activity process models are more satisfactory than behavioural ones. Namely, activity reference models express the general nature of the processes (for instance they identify the interfaces and co-operation between elementary activities) and can be instantiated according to particular needs. Behavioural models are useful for the purposes of simulation and certain analysis tasks, but can only be produced if the business activity is procedural in nature. This means that some activity models (those which are fully implemented in an automated way) may be further detailed using a behavioural model. Also, high-level behavioural models may be constructed to describe certain procedures the lowest level activities of which, however, are not procedural and thus need to be treated as elementary from the behavioural point of view. For the success of business process modelling, business process reengineering or just the simple redeployment of the best practice described by reference models, easy accessibility and distribution of business process models is one of the key factors. Organisations can use a variety of information infrastructures and technologies (usually already available and present in organisations) such as the Intranet, web technology, etc. Using such a distribution mechanism, process models can be made available to all stakeholders, and their access can be made platform (software and hardware) independent.

5. BUSINESS PROCESS MODELLING AND KNOWLEDGE MANAGEMENT
As economies move into the information age and post-industrial era, information and knowledge become important, if not the most important, resources to organisations (Bell, 1973).
Knowledge is widely recognised as being the key asset of enterprises. Therefore, knowledge and its use are regarded as the primary source of competitive advantage of enterprises and the base for an enterprise's long-term growth, development and existence. The awareness of the strategic importance of knowledge has also been reflected, recognised and investigated in the strategic management field. E.g., the resource-based view (RBV) of strategic management regards knowledge, privately held by the enterprise, as a basic source of competitive advantage. It is argued that a company's competitive strength is derived from the uniqueness of its internally accumulated capabilities (Conner and Prahalad, 1996; Schultze, 2002). The RBV approach therefore implies that not all knowledge is equally valuable. A (knowledge) resource that can freely be accessed or traded in the market has limited ability to serve as a source of competitive advantage (although such a knowledge resource could still improve the organisation's competitiveness). Because of the evident importance of knowledge, it is no surprise that present times are described by phrases like 'knowledge society' or 'knowledge economy'.

5.1. What is knowledge?
In the literature, several different definitions of knowledge can be found. The Oxford English Dictionary (1999) defines knowledge as the "facts, feelings, or experiences known by a person or group of people". According to Baker et al. (1997), knowledge is present in ideas, judgements, talents, root causes, relationships, perspectives and concepts. Knowledge can be related to customers, products, processes, culture, skills, experiences and know-how. Bender and Fish (2000) consider that knowledge originates in the head of an individual (the mental state of ideas, facts, concepts, data and techniques, as recorded in an individual's memory) and builds on information that is transformed and enriched by personal experience, beliefs and values with decision- and action-relevant meaning. Knowledge formed by an individual could differ from knowledge possessed by another person receiving the same information. Similarly to the above definition, Baker et al. (1997) define knowledge in the form of a simple formula:

Knowledge = Information + [Skills + Experience + Personal Capability]
This simple equation must be interpreted to give knowledge a deeper meaning: knowledge is created from information as interpreted and remembered by a person with given skills, experience and personal capabilities, and is the ability to use this information to guide the actions of the person in a manner that is appropriate to the situation. It is noteworthy that this does not imply that the person is aware of this knowledge or that he/she can explain (externalise) it. These distinctions are important to consider when planning to discover what knowledge is available, or when intending to establish knowledge transfer and sharing.
5.2. Need for knowledge management
Why is KM one of the hottest topics of the past decade, when the basic techniques of KM, which help people to capture and share their knowledge, experience and expertise, have been known and applied for a long time? The authors believe that the great interest in KM is conditioned by several drivers. First, the birth of KM, which occurred in the early 1990s, grew from the recognition of how difficult it is to deal with complexity in an environment of ever increasing competition spurred by technology and the demands of sophisticated customers (Bennet and Bennet, 2002). Second, the idea of KM has created considerable interest because it gives a deeper explanation of managers' interest in core competencies, their communication, and their transfer. It also creates awareness of knowledge as an important economic asset, and of the special problems of managing such assets (Spender, 2002). Third, many companies have a world-wide distributed organisation, and the dissemination of company knowledge requires suitable techniques of knowledge management (such as knowledge acquisition and sharing). This situation is made even more difficult in organisations that operate in culturally diverse environments. Fourth, the pace of adoption of Internet technology, especially the establishment of Intranets, Extranets, Web portals, etc., has created a networking potential that drives all of society and corporations to work faster, create and manage more interdependencies, and operate on global markets (Bennet and Bennet, 2002). Finally, as a fifth driver, in the 1990s companies became aware of the threat and risk of losing valuable key organisational knowledge that is often present only in employees' heads (knowledge which is not explicit, externalised or formalised and is consequently not available for use by other individuals). At the same time, the demand for quicker growth of the knowledge (competence) of employees has become a new driver to manage organisational knowledge. Consequently, KM is expected to provide (Holsapple and Joshi, 2002):
• an organisational response to the awareness of the importance of information and knowledge,
• an answer to how organisational knowledge can be used more efficiently,
• techniques to create, capture, formalise, organise, integrate, tailor, share, spread and reuse organisational knowledge, and
• techniques to make available the right knowledge to the right people in the right representations and at the right time.
Besides the technical and methodological issues of the implementation of KM, the socio-cultural aspects of KM must also be carefully considered. According to Baker et al. (1997) KM should continually improve the effectiveness of available knowledge by focusing on the key people, processes and technology. Companies should develop an organizational ethos which applies the concept of knowledge management as the norm. According to Bender and Fish (2000), KM is not a programme but a new way of working that needs to be embedded into an organisation's culture through its overall strategy and the design of operations.
Even though the concept of KM has emerged only recently, there are a number of initiatives that organisations have adopted earlier and that are useful components in implementing KM. The results of the learning organization, business process re-engineering, business process modelling, quality management and business intelligence movements can be used as a foundation for a comprehensive adoption of KM and the building of knowledge-based companies.

5.3. The nature of knowledge and its sharing
The KM literature defines two main knowledge categories: explicit and tacit. Polanyi (1966) defines tacit knowledge as knowledge which is implied, but is not actually documented; nevertheless the individual 'knows' it from experience, from other people, or from a combination of sources. Explicit knowledge is externally visible; it is documented tacit knowledge (Junnarkar and Brown, 1997). Skyrme and Amidon (1997) define explicit knowledge as formal, systematic and objective, generally codified in words or numbers. Explicit knowledge can be acquired from a number of sources including company-internal data, business processes, records of policies and procedures, as well as from external sources such as intelligence gathering. Tacit knowledge is more intangible. It resides in an individual's brain and forms the basis on which individuals make decisions and take action, but is not externalised in any form. Polanyi (1958) also gives another detailed and substantial definition of the knowledge categories. He sees tacit knowledge as a personal form of knowledge which individuals can only obtain from direct experience in a given domain. Tacit knowledge is held in a non-verbal form, and therefore the holder cannot provide a useful verbal explanation to another individual. Instead, tacit knowledge typically becomes embedded in, for example, routines and cultures. As opposed to this, explicit knowledge can be expressed in symbols and communicated to other individuals by the use of these symbols. Beijerse (1999) states that explicit knowledge is characterised by its ability to be expressed as words or numbers, in the form of hard data, scientific formulas, manuals, computer files, documents, patents and standardised procedures or universal works of reference that can easily be transferred and spread. Implicit (tacit) knowledge, on the other hand, is mainly people-bound and difficult to formalise, and therefore difficult to transfer or spread. It is mainly located in people's 'hearts and heads'. Considering the aforementioned definitions, the authors define explicit knowledge as knowledge which can be articulated and written down. Therefore, such knowledge can be externalised and consequently shared and spread. Tacit knowledge is developed in and derives from the practical environment; it is highly pragmatic and specific to the situations in which it has been developed. Tacit knowledge is subconscious; it is understood and used but it is not identified in a reflective, or aware, way. Although tacit knowledge is not directly externalisable, it is sometimes possible to create externalisations17 that may be used by someone else to acquire the same tacit knowledge. Tacit knowledge could be made up of insights, judgement, know-how, mental models, intuition and

17 I.e., these externalisations do not contain a record of the knowledge itself; rather they contain information that another person could (under certain circumstances) use to construct the same knowledge out of his/her already possessed internal knowledge.
[Figure 5.1 (diagram): domains of knowledge arranged by internal vs. externalised and by formal / not formal / not formalisable. Internal knowledge spans explicit (aware) knowledge, informal awareness, and the ability to act in situations with no awareness; the externalised counterparts span formal explanations and models, informal explanations and incomplete models, and observable actions, demonstrations and recounts. Arrows denote the various processes a person could use to share internal knowledge; a person may become aware of formerly unaware knowledge, and can develop or learn techniques to formalise it.]

Figure 5.1: Knowledge categories.
beliefs, and may be shared through direct conversation, the telling of stories and sharing common experiences. The above definitions give rise to a categorisation that can be used to make practically important differentiations between various forms of knowledge. The authors propose to divide knowledge into sub-categories according to the following criteria (see Figure 5.1):
• Is the knowledge internalised in a person's head or has it been externalised (internal/externalised)? In other words, have any external records been made (in the form of written text, drawings, models, presentations, demonstrations, etc.)?
• Is there awareness of this knowledge (explicit/tacit)? Awareness means here that the person identifies this knowledge as something he/she is in possession of and which could potentially be shared with others. In other words, the person not only can use the knowledge to act adequately in situations, but also conceptualises this knowledge (this awareness may be expressed by statements such as "I can tell you what to do", "I can explain how to do it"). The lack of awareness manifests in statements like "I can not tell you how to do it, but I can show you".
• Does the externalisation have a formalised representation or not (formal/not formal)? Formalisation here means that the external representation of the knowledge is in a consistent and complete mathematical/logical form (or equivalent).
Note that each domain of knowledge may contain a mixture of tacit and explicit constituents.
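Purely as an illustration (not part of the original text; the category names follow Figure 5.1 and all identifiers are hypothetical), these criteria could be rendered as a small data structure:

```python
from dataclasses import dataclass

@dataclass
class KnowledgeItem:
    description: str
    externalised: bool   # internal vs. externalised record
    aware: bool          # explicit (aware) vs. tacit
    formal: bool         # formal vs. not formal representation

def categorise(item: KnowledgeItem) -> str:
    """Map the three criteria onto the categories of Figure 5.1."""
    if not item.externalised:
        return "explicit (aware) internal" if item.aware else "tacit internal"
    if item.formal:
        return "formal externalisation (explanations and models)"
    return "informal externalisation (stories, demonstrations, recounts)"

print(categorise(KnowledgeItem("order-handling process model",
                               externalised=True, aware=True, formal=True)))
```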
5.4. The knowledge process and knowledge resources
A comprehensive survey of the KM literature shows various knowledge management frameworks and KM activities. Some frameworks are composed of very low-level activities, while in other frameworks elementary activities are grouped into higher-level activities. Nonaka and Takeuchi (1995) define four processes:
• Internalisation is the process in which an individual internalises explicit knowledge to create tacit knowledge. In Figure 5.1 this corresponds to turning aware knowledge into tacit knowledge; Nonaka does not differentiate between formal and informal awareness.
• Externalisation is the process in which the person turns their tacit knowledge into explicit knowledge through documentation, verbalisation, etc. In Figure 5.1 this process corresponds to turning tacit knowledge into aware knowledge and subsequently communicating it (internal → externalised).
• Combination is the process where new explicit knowledge is created through the combination of other explicit knowledge. In Figure 5.1 this process is internal to explicit knowledge, and does not differentiate cases such as formalising knowledge, i.e. the transition informal awareness → formal awareness.
• Socialisation is the process of transferring tacit knowledge between individuals through observation and working with a mentor or a more skilled/knowledgeable individual. In Figure 5.1 this corresponds to tacit knowledge → observable actions, etc.
Davenport and Prusak (1998) identify four knowledge processes: knowledge generation (creation and knowledge acquisition), knowledge codification (storing), knowledge transfer (sharing), and knowledge application (these processes can be represented as various transitions between the knowledge categories in Figure 5.1). Alavi and Marwick (1997) define six KM activities: a) acquisition, b) indexing, c) filtering, d) classification, cataloguing, and integrating, e) distributing, and f) application or knowledge usage, while Holsapple and Whinston (1987) identify a more comprehensive KM process, composed of the following activities: a) procure, b) organise, c) store, d) maintain, e) analyse, f) create, g) present, h) distribute and i) apply. Holsapple and Joshi (2002) present four major categories of knowledge manipulation activities:
• acquiring activity, which identifies knowledge in the external environment (from external sources) and transforms it into a representation that can be internalised and used (the two steps of internalisation, external → aware internal and aware internal → tacit, are not differentiated);
• selecting activity, identifying needed knowledge within an organisation's existing resources; this activity is analogous to acquisition, except that it manipulates resources already available in the organisation;
• internalising, which involves incorporating the knowledge or making it part of the organisation;
• using, which is an umbrella phrase for a) the generation of new knowledge by processing existing knowledge, and b) externalising knowledge, which makes knowledge available outside the organisation.
The above processes are applicable to the organisation as an entity, rather than addressing knowledge processes from the point of view of an individual. As a conclusion: organisations should be aware of the complete process of knowledge flow, looking at the flow between the organisation and the external world and the flow among individuals within (and outside) the organisation. The latter is an important case, because in many professional organisations individuals belong to various communities, and their links to these communities are as important to them as the link to their own organisation.

5.4.1. Knowledge resources
Knowledge manipulation activities operate on knowledge resources (KR) to create value for an organisation. Value generation depends, on the one hand, on the availability and quality of knowledge resources and, on the other, on the application of knowledge manipulation skills to execute knowledge manipulation activities. Holsapple and Joshi (2002) developed a taxonomy of KR, categorising them into schematic and content resources. The taxonomy identifies four schematic resources and two content resources, the latter appearing in the form of participants' knowledge and artefacts. Both schema and content are essential parts of an organisation's knowledge resources. Content knowledge is embodied in usable representations. The primary distinction between participants' knowledge and artefacts lies in the presence or absence of knowledge processing abilities. Participants have knowledge manipulation skills that allow them to process their own repositories of knowledge; artefacts have no such skills. An organisation's participant knowledge is affected by the arrival and departure of participants and by participant learning. As opposed to this, a knowledge artefact does not depend on a participant for its existence. Representing knowledge as an artefact involves the embodiment of that knowledge in an object, thus positively affecting its ability to be transferred, shared, and preserved (in Figure 5.1 knowledge resources correspond to recorded externalised knowledge). Schema knowledge is represented or conveyed in the working of an organisation. It manifests in the organisation's behaviours. Perceptions of schematic knowledge can be captured and embedded in artefacts or in participants' memories, but it exists independently of any participant or artefact. Schematic knowledge resources are interrelated and none can be defined in terms of the others. Four schematic knowledge resources can be identified: a) culture (the basic assumptions and beliefs that are shared by members of an organisation), b) infrastructure (the knowledge about the roles that have been defined for participants), c) purpose (defining an organisation's reason for existence), and d) strategy (defining what to do in order to achieve the organisational purpose in an effective manner).
In addition to its own knowledge resources, an organisation can draw on its environment, which holds potential sources of knowledge. Through contacts with its environment, an organisation can replenish its knowledge resources. The environmental sources do not actually belong to the organisation, nor are they controlled by it. When knowledge is acquired from an environmental source, it becomes an organisational source.

5.5. Business process modelling and knowledge management
Many knowledge management systems (KMSs) are primarily focused on solutions for the capture, organisation and distribution of knowledge. Ruggles (1998), for example, found that the four most common KM projects conducted by organisations were creating/implementing an intranet, knowledge repositories, decision support tools, or groupware to support collaboration. Spender (2002) states that the bulk of the KM literature is about computer systems and applications of 'enterprise-wide data collection and collaboration management', which enhance communication volume, timeliness, and precision. Indeed, current KM approaches focus too much on techniques and tools that make the captured information available, and relatively little attention is paid to those tools and techniques that ensure that the captured information is of high quality or that it can be interpreted in the intended way. Teece (2002) points out a simple but powerful relationship between the codification of knowledge and the costs of its transfer. Simply stated: the more a given item of knowledge or experience has been codified (formalised, in the terminology of Figure 5.1), the more economically it can be transferred. Uncodified knowledge is slow and costly to transmit. Ambiguities abound and can be overcome only when communication takes place in face-to-face situations. Errors of interpretation can be corrected by the prompt use of personal feedback. The transmission of codified knowledge, on the other hand, does not necessarily require face-to-face contact and can often be carried out by mainly impersonal means. Messages are better structured and less ambiguous if they can be transferred in codified form. Based on the presented features of business process modelling (and, in the broader sense, enterprise modelling) and the issues in knowledge capturing and sharing, BPM is important not only for process engineering but also as an approach that allows the transformation of informal knowledge into formal knowledge, and that facilitates externalisation, sharing and subsequent knowledge internalisation. BPM has the potential to improve the availability and quality of captured knowledge (due to its formal nature), increase reusability, and consequently reduce the costs of knowledge transfer. The role and contribution of BPM in knowledge management is discussed in more detail in Section 5.6.

5.5.1. BPM and KM are related issues
While the methods for developing enterprise models became established during the 1990s (both for business process analysis and design), these methods have
concentrated on how such models can support analysis and design teams, and the question of how these models can be used for effective and efficient sharing of information among other stakeholders (such as line managers and engineering practitioners) has been given less attention. If enterprise models, such as business process models, embody process knowledge, then it must be better understood to what extent and how existing process knowledge can be externalised as formal models, and under what conditions these models may be effectively communicated among stakeholders. Such analysis may reveal why the same model that is perfectly suitable for a business process analyst or designer may not be appropriate for end users in management and engineering. Thus the authors developed a theoretical framework which can give an account of how enterprise models capture and allow the sharing of knowledge of processes, whether this knowledge is possessed by individuals or by groups of individuals in the company. The framework also helps avoid the raising of false expectations regarding the effects of business modelling efforts.

5.6. The knowledge life-cycle model
Figure 5.2 introduces a simple model of the knowledge life-cycle, extending (detailing) the models proposed by Nonaka and Takeuchi (1995), and Zack and Serino (1998). Our extension is based on Bernus et al. (1996), which treats enterprise models as objects for semantic interpretation by participants in a conversation, and establishes the criteria for uniform (common) understanding. Understanding is of course most important in knowledge sharing. After all, a model of company knowledge that can only be interpreted correctly by the person who produced it is of limited use for anyone else. Moreover, misinterpretation may not always be apparent; thus the lack of shared interpretation of enterprise models (and the lack of guarantees to this effect) may cause damage. This model (Figure 5.2) represents the relations between different types of knowledge, and will be used as a theoretical framework. In order for employees to be able to execute production, service or decisional processes they must possess some 'working knowledge' (e.g. about process functionality, required process inputs and delivered outputs, organisation, management, etc.). Working knowledge is constantly developed and updated through receiving information from the internal environment (based on the knowledge creation process) and from the external environment (through the process of knowledge acquisition). Working knowledge (from the perspective of the knowledge holder) is usually tacit. Knowledge holders do not need to use the possessed knowledge in its explicit, formalised form to support their actions. They simply understand and know what they are doing and how they have to carry out their tasks; having to resort to the use of explicit formal knowledge would usually slow down the action. According to its suitability for formalisation, such working knowledge can be divided into two broad groups: formalisable and not-formalisable knowledge. (Note this is not the same as the formalised/not-formalised distinction, because there may be tacit knowledge that is held in an unaware way, but with suitable enquiry into that knowledge, ways may be found to make it aware and explicit, either in a formal or not formal way.)
[Figure 5.2 (diagram): the knowledge life-cycle. The diagram links tacit knowledge, explicit knowledge (aware), formal knowledge (aware) and formal interpreted knowledge (aware) through discovery processes, externalisation processes (requiring formalisation skills) and internalisation; culturally shared (situation) knowledge supports interpretation. Experience develops through acquiring and creating knowledge, and applied knowledge makes an impact on reality (business processes).]

Figure 5.2: The knowledge life-cycle model.
Such a division of knowledge (into formalisable and not formalisable) seems to closely correspond to how much the process can be structured, i.e. decomposed into a set of interrelated lower-level constituent processes. These characteristics can be observed when considering knowledge about different typical business process types. The formalisation and structural description of innovative and creative processes, such as some management, engineering and design processes (or, in general, the group of ad-hoc processes), is a difficult task, due to the fact that the set of constituent processes is not predefined, nor is the exact nature of their combination well understood by those who have the knowledge. Consequently, knowledge about this type of process could be considered tacit knowledge (because these are not-formalisable, unaware processes), i.e. not suitable for formalisation/structuring. In contrast to the characteristics of the group of ad-hoc processes, the group of ill-structured and structured (repetitive or algorithmic) processes can be formalised and structured at least to a degree; consequently the knowledge about these processes may become explicit formal knowledge. Examples of such processes are management, engineering and design at the level of co-ordination between activities as performed by separately acting individuals or groups, and repetitive business and manufacturing activities. The formalisable part of knowledge (knowledge about structured and ill-structured processes) is extremely important and valuable for knowledge management, because it may be distributed and thus shared with relative ease. Namely, the process of transformation of the formalisable part of tacit knowledge into formal knowledge (the formal part of explicit knowledge) represents one of the crucial processes in knowledge management. The authors believe that the cost of knowledge management (measured by the level of reuse and the return on investment to the enterprise) in the case of formal explicit knowledge would be lower than in the case of tacit (unaware/implicit), or even unstructured explicit, knowledge, simply because the sharing of the latter is a slow and involved process. To be able to perform the aforementioned formalisation process we need additional competencies known as culturally shared or situation knowledge (e.g. knowledge shared by the community that is expected to uniformly interpret the formal models of the target processes). Culturally shared knowledge plays an essential role in the understanding of the process or entity in question and in its formalisation and structuring. E.g. the definition of an accounting process can only be done by an individual who understands accounting itself, but this formalisation will be interpreted by other individuals who must have an assumed prior culturally shared and situational knowledge that is not part of the formal representation (Bernus et al., 1996). As already mentioned, one of the key objectives of KM is the externalisation of participants' knowledge. Depending on the type of knowledge (tacit or explicit), different tools and approaches to knowledge capturing may be used:
• Tacit knowledge (whether formalisable or not) can be transferred through live in situ demonstrations, face-to-face storytelling, or captured informal presentations (e.g. multimedia records, personal accounts of experience, or demonstrations). Note that
tacit formalisable knowledge may be discovered through a research process and thus made explicit. Subsequently such knowledge may be captured as described in the bullet point below.
• Explicit knowledge can be captured and presented in external presentations (through the process of knowledge capturing, also known as knowledge codification). An external presentation may be formal or not formal. A textual description, as in quality procedure documents (ISO 9000), is not formal, while different enterprise models (e.g. functional business process models) are examples of formal external representations of knowledge (knowledge externalisations). Formal and informal external representations are called knowledge artefacts. The advantage of using formal models for process description is the quality of the captured knowledge. To actually formalise knowledge, formalisation skills are needed (in this case business process modelling skills).
The above process of knowledge externalisation has to be complemented by a matching process of knowledge internalisation that is necessary for the use of the available knowledge resources. According to the type and form of externalised knowledge, various internalisation processes (and corresponding skills) are necessary. In general, the less formal the presentation/representation, the more prior assumed situation-specific knowledge is necessary for correct interpretation. Conversely, more formal representations allow correct interpretation through the use of more generic knowledge and require less situation-specific knowledge. Thus formalisation helps enlarge the community that can share the given knowledge resource. An informal external presentation of knowledge accompanied by its interpretation (e.g. interpretation of the presented story) can directly build working (tacit) knowledge; however, the use of these presentations is only possible in limited situations, and it is difficult to verify that correct interpretation took place, as well as the completeness of the knowledge transfer. The verification of correct interpretation and completeness is only possible through direct investigation of the understanding of the individuals who internalised this type of knowledge. This is a serious limitation for knowledge sharing through informal means. A formal external presentation, such as a business process model developed in the IDEF0 (ICAM DEFinition) modelling language (Menzel and Mayer, 1998), must first be interpreted to be of use. To interpret the content of the information captured in such a model, formal model interpretation skills are needed. These skills are generic and not situation dependent, therefore even culturally distant groups of people can share them. Still, such a formal representation must be further interpreted by reference to culturally shared, prior assumed knowledge so that the content of the formal knowledge (the information captured in the business process model) can be understood and interpreted in the intended way, and thus integrated into working knowledge (to improve competencies). However, to test for correct interpretability it is possible to test whether the primitive concepts in the model (i.e. those not further explained/decomposed) are
commonly understood. If this is the case then the formal nature of the model guarantees uniform interpretability. Completeness can be tested without the direct investigation of the understandings of those individuals who internalise this formal knowledge (i.e. the developer of the formal model can test himself or herself whether the model is complete, provided the primitive concepts used are uniformly understood18). The reuse of formal externalised knowledge can have an impact on the execution of processes in terms of their efficiency, according to the well known fact that formally learnt processes must undergo an internalisation process, after which they are not used in a step-by-step manner. Therefore, the transfer of the acquired formal knowledge into tacit knowledge is a 'natural' learning process and is necessary for efficiency. The internalisation of externalised formal knowledge thereby closes the loop of the knowledge life-cycle. Besides the importance of the formalisation/structuring process of knowledge, easy accessibility and distribution of business process models is one of the key factors for a successful deployment of EM practice in organisations.

18 This test is commonly ignored by developers of formal models, probably because they assume that primitive concepts are all known through the users' formal education.
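As an illustration of such a formal representation (a sketch only, not the authors' method; the activity, arrow and glossary names are hypothetical), the following represents one IDEF0-style activity with its standard Input, Control, Output and Mechanism (ICOM) arrows, together with a naive check that the model's primitive concepts appear in a shared glossary:

```python
from dataclasses import dataclass, field

@dataclass
class Activity:
    """One IDEF0 activity box with its ICOM arrows."""
    name: str
    inputs: list[str] = field(default_factory=list)      # transformed by the activity
    controls: list[str] = field(default_factory=list)    # constrain the activity
    outputs: list[str] = field(default_factory=list)     # produced by the activity
    mechanisms: list[str] = field(default_factory=list)  # resources performing it

confirm_order = Activity(
    name="A1 Confirm customer order",
    inputs=["customer order"],
    controls=["quality procedure QP-07", "credit policy"],
    outputs=["confirmed order", "rejection notice"],
    mechanisms=["sales officer", "ERP system"])

# Check, in the spirit of the interpretability test above, that every arrow
# label (primitive concept) appears in a glossary shared by the stakeholders.
glossary = {"customer order", "confirmed order", "rejection notice",
            "quality procedure QP-07", "credit policy",
            "sales officer", "ERP system"}
undefined = [c for c in (confirm_order.inputs + confirm_order.controls +
                         confirm_order.outputs + confirm_order.mechanisms)
             if c not in glossary]
print("undefined primitive concepts:", undefined)
```

Such a check is only a crude stand-in for the interpretability test described above, but it shows how a formal representation makes that test mechanisable.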
6. CONCLUSION

This article reviewed business process modelling with an emphasis on industrial practice. There are several approaches to BPM, which causes fragmentation of effort and parallel developments in the area. Enterprise Integration/Enterprise Architecture schools place an emphasis on modelling business processes on the concept and requirements levels, irrespective of the level of automation that might be intended. Workflow modelling schools concentrate on process modelling from the point of view of the possibility of automating business processes. Unfortunately these schools propose different modelling languages, and thus the top-down and bottom-up approaches do not meet in a straightforward manner. However, many of the obstacles are artificial. 1) Requirements-level activity models (such as may be expressed in IDEF0) capture the business process in a way which allows several possible behavioural implementations, and if a behavioural model needs to be developed (e.g., in view of possible automation, also called model-based control, or workflow execution) the activity models define the requirements for such models and can be used as an input for behavioural/workflow modelling; 2) The various behavioural modelling languages are fairly similar in their expressive power, therefore the choice between them is dictated by the intended tools of implementation. Unfortunately very few of these languages allow resource modelling, which limits their usefulness, because a crucial question that management wants answered is: what are the resource requirements of processes and how do these compare with the capabilities of existing resources? An important aspect of business process modelling, which is not in widespread use yet, but in the authors' view crucial, is the modelling of the management and control system (the decision system) of the enterprise. Decisional models can be used
to define the way the enterprise's activities are co-ordinated, on every management horizon: strategic, tactical, operational and real-time. This co-ordination includes inter-enterprise as well as intra-enterprise management tasks. Thus, the traditional area of business process modelling is extended to unstructured and ill-structured processes. A combination of activity models and behavioural models can cover all enterprise processes, and satisfy the ISO 9001:2000 requirements for business process management, including all core processes of the enterprise. It was also pointed out that process modelling and the modelling of the information, resource and organisation views of the enterprise are intricately interconnected, and business process modelling practice should address all four views of the GERA modelling framework. Furthermore, the explicit definition of the enterprise's decision system also defines user requirements for the Management Information System (which might be implemented either using contemporary decision support tools and data warehousing technology, or using more traditional techniques). The article pointed out that the scope of using reference models in BPM is not limited to behavioural models: useful activity models can be used even in cases where the actual implementation of these reference models may be different in each case, such as in new product development projects. Finally, the role of business process modelling was discussed in business process reengineering and knowledge management.

REFERENCES

Abel, D. F. (2003) Changing Leadership Responsibilities and the Development of Tomorrow's Leaders. IEDC Bled.
Alavi, M., and Marwick, P. (1997) One Giant Brain. Boston (MA): Harvard Business School. Case 9-397-108.
AMICE (1993) CIMOSA: Open System Architecture for CIM. 2nd extended and revised version. Springer-Verlag.
Baker, M., Baker, M., Thorne, J., and Durnell, M. (1997) Leveraging Human Capital. Journal of Knowledge Management. MCB University Press. 01:1 pp. 63-74.
Bashein, B. J., Markus, L., and Riley, P. (1994) Business Process Reengineering: Preconditions for Success and How to Prevent Failures. Information Systems Management. 11(2) pp. 7-13.
Beijerse, R. P. (1999) Questions in knowledge management: defining and conceptualising a phenomenon. Journal of Knowledge Management. MCB University Press. 03:2 pp. 94-110.
Bell, D. (1973) Coming of Post-Industrial Society: A Venture in Social Forecasting. New York.
Bennet, D., and Bennet, A. (2002) The Rise of the Knowledge Organisations. in Holsapple, C. W. (Ed.) Handbook on Knowledge Management 1. Berlin: Springer-Verlag. pp. 5-20.
Bender, S., and Fish, A. (2000) The transfer of knowledge and the retention of expertise: the continuing need for global assignments. Journal of Knowledge Management. MCB University Press. 04:2 pp. 125-137.
Bernus, P., Nemes, L., and Moriss, B. (1996) The Meaning of an Enterprise Model. in Bernus, P., and Nemes, L. (Eds.) Modelling and Methodologies for Enterprise Integration. London: Chapman and Hall. pp. 183-200.
Bernus, P., and Nemes, L. (1999) Organisational Design: Dynamically Creating and Sustaining Integrated Virtual Enterprises. in Proceedings of the IFAC World Congress. London: Elsevier. Vol. A pp. 189-194.
Chen, D., and Doumeingts, G. (1996) The GRAI-GIM reference model, architecture and methodology. in Bernus, P., Nemes, L., and Williams, T. J. (Eds.) Architectures for Enterprise Integration. London: Chapman & Hall. pp. 102-126.
CIMOSA Association (1996) CIMOSA Technical Baseline. Germany: CIMOSA Association.
Conner, K., and Prahalad, C. K. (1996) A resource-based theory of the firm: Knowledge versus opportunism. Organization Science. 7(5) pp. 477-501.
Davenport, T. H. (1993) Process Innovation: Reengineering Work through Information Technology. Boston (MA): Harvard Business School Press.
Davenport, T. H., and Short, J. E. (1990) The New Industrial Engineering: Information Technology and Business Process Redesign. Sloan Management Review. pp. 11-27.
Davenport, T. H., and Prusak, L. (1998) Working Knowledge: How Organizations Manage What They Know. Boston (MA): Harvard Business School Press. p. 16.
Doumeingts, G., Vallespir, B., and Chen, D. (1998) Decisional Modelling GRAI Grid. in Bernus, P., Mertins, K., and Schmidt, G. (Eds.) Handbook on Architectures of Information Systems. Berlin: Springer-Verlag. pp. 615-618.
EFQM (1999) The EFQM Excellence Model. Brussels: European Foundation for Quality Management.
van Eijk, P. H. J., Vissers, C. A., and Diaz, M. (Eds.) (1989) The Formal Description Technique LOTOS. Amsterdam: Elsevier Science Publishers B.V.
Gruninger, M. (1997) Integrated ontologies for enterprise modelling. in: Proceedings of ICEIMT'97. Torino (Italy).
Hamel, G., and Prahalad, C. K. (1994) Competing for the Future. Boston (MA): Harvard Business School Press.
Hammer, M., and Champy, J. (1993) Reengineering the Corporation. New York: Harper-Collins Publishers.
Holsapple, C. W., and Joshi, K. D. (2002) A Knowledge Management Ontology. in Holsapple, C. W. (Ed.) Handbook on Knowledge Management 1. Berlin: Springer-Verlag. pp. 89-128.
Holsapple, C. W., and Whinston, A. B. (1987) Knowledge-based Organizations. Information Society. (2) pp. 77-89.
Hysom, R. (2003) Enterprise Modelling - The Readiness of the Organisation. in Bernus, P., Nemes, L., and Schmidt, G. (Eds.) Handbook on Enterprise Architecture. Berlin: Springer-Verlag. pp. 373-416.
IFIP-IFAC Task Force on Architectures for Enterprise Integration (2003) GERAM: Generalised Enterprise Reference Architecture and Methodology. http://www.cit.gu.edu.au/~bernus/taskforce/geram/versions/geram1-6-3/v1.6.3.html
ISO JTC 1/SC 7 (1989) ISO 8807 Information processing systems - Open Systems Interconnection - LOTOS - A formal description technique based on the temporal ordering of observational behaviour.
ISO JTC 1/SC 7 (2002) ISO/IEC 15288:2002 Systems engineering - System life cycle processes.
ISO/TC 176/SC 2 (2000) ISO 9004:2000 Quality management systems - Guidelines for performance improvements.
Junnarkar, B., and Brown, C. V. (1997) Re-assessing the Enabling Role of Information Technology in KM. Journal of Knowledge Management. MCB University Press. 1(2) pp. 142-148.
Kalpic, B., and Bernus, P. (2002) Business Process Modelling in Industry - the Powerful Tool in Enterprise Management. Computers in Industry. Elsevier. 47(3) pp. 299-318.
Kalpic, B., Pandza, K., and Bernus, P. (2003) Strategy as a Creation of Corporate Future. in Bernus, P., Nemes, L., and Schmidt, G. (Eds.) Handbook on Enterprise Architecture. Berlin: Springer-Verlag. pp. 213-254.
Kaplan, R., and Norton, D. P. (1996) The Balanced Scorecard: Translating Strategy into Action. Boston: Harvard Business School Press.
Knowledge Based Systems, Inc. (2001a) IDEF Methods: IDEF0 Overview - Function Modelling Method. http://www.idef.com/idef0.html
Knowledge Based Systems, Inc. (2001b) IDEF Methods: IDEF3 Process Flow and Object State Description Capture Method Overview. http://www.idef.com/idef3.html
Kosanke, K. (1992) CIMOSA - A European Development for Enterprise Integration. Part 1: An Overview. in: Enterprise Integration Modelling. Cambridge (MA): The MIT Press. pp. 179-188.
Malhotra, Y. (1998) Business Process Redesign: An Overview. IEEE Engineering Management Review. 26(3) pp. 27-31.
Menzel, C., and Mayer, R. J. (1998) The IDEF Family of Languages. in: Bernus, P., Nemes, L., and Williams, T. J. (Eds.) Architectures for Enterprise Integration. London: Chapman & Hall. pp. 102-126.
Mertins, K., and Bernus, P. (1998) Reference Models. in: Bernus, P., Mertins, K., and Schmidt, G. (Eds.) Handbook on Architectures of Information Systems. Berlin: Springer-Verlag. pp. 615-618.
Nonaka, I., and Takeuchi, H. (1995) The Knowledge-Creating Company: How Japanese Companies Create the Dynamics of Innovation. New York: Oxford University Press.
Noran, O. (2003) A Mapping of Individual Architecture Frameworks (GRAI, PERA, C4ISR, CIMOSA, ZACHMAN, ARIS) onto GERAM. in Bernus, P., Nemes, L., and Schmidt, G. (Eds.) Handbook on Enterprise Architecture. Berlin: Springer-Verlag. pp. 65-212.
Oxford University Press (1999) The Oxford English Dictionary. Version 2.0.
PIF Working Group (2003) The Process Interchange Format (PIF) Project. http://ccs.mit.edu/pif/
Polanyi, M. (1958) Personal Knowledge. Chicago: University of Chicago Press.
Polanyi, M. (1966) The Tacit Dimension. New York: Doubleday.
Prahalad, C. K., and Hamel, G. (1990) The core competence of the corporation. Harvard Business Review. 68(3) pp. 79-91.
Ruggles, R. (1998) The State of the Notion: Knowledge Management in Practice. California Management Review. 40(3) pp. 80-89.
Schmidt, G. (1998) GPN - Generalised Process Networks. in: Bernus, P., Mertins, K., and Schmidt, G. (Eds.) Handbook on Architectures of Information Systems. Berlin: Springer-Verlag. pp. 191-208.
Schultze, U. (2002) On Knowledge Work. in: Holsapple, C. W. (Ed.) Handbook on Knowledge Management 1. Berlin: Springer-Verlag. pp. 43-58.
Skyrme, D., and Amidon, D. (1997) The Knowledge Agenda. Journal of Knowledge Management. MCB University Press. 1(1) pp. 27-37.
Spender, J. C. (2002) Knowledge Fields: Some Post-9/11 Thoughts about the Knowledge-Based Theory of the Firm. in: Holsapple, C. W. (Ed.) Handbook on Knowledge Management 1. Berlin: Springer-Verlag. pp. 59-72.
Spur, G., Mertins, K., and Jochem, R. (1993) Integrierte Unternehmensmodellierung. Berlin: Beuth Verlag.
Teece, D. J. (2002) Knowledge and Competence as Strategic Assets. in: Holsapple, C. W. (Ed.) Handbook on Knowledge Management 1. Berlin: Springer-Verlag. pp. 129-152.
Teng, J. T. C., Grover, V., and Fiedler, K. D. (1994) Business Process Reengineering: Charting a Strategic Path for the Information Age. California Management Review. 36(3) pp. 9-31.
Uppington, G., and Bernus, P. (1998) Assessing the necessity of enterprise change: pre-feasibility and feasibility studies in enterprise integration. International Journal of Computer Integrated Manufacturing. 11(5) pp. 430-447.
Vernadat, F. (1996) Enterprise Modelling and Integration - Principles and Applications. London: Chapman & Hall.
Vernadat, F. (1998) The CIMOSA Languages. in: Bernus, P., Mertins, K., and Schmidt, G. (Eds.) Handbook on Architectures of Information Systems. Berlin: Springer-Verlag. pp. 243-264.
Warnecke, H. J. (1993) The Fractal Company. Berlin: Springer-Verlag.
Westkämper, E. (1997) Integrated Production with Virtual Elements. in: Proceedings of the 29th CIRP International Seminar on Manufacturing Systems. Osaka (Japan). pp. 50-56.
Williams, T. J. (1994) The Purdue Enterprise Reference Architecture. Computers in Industry. Elsevier. 24(2-3) pp. 141-158.
Zack, M. H., and Serino, M. (1998) Knowledge Management and Collaboration Technologies. http://www.lotus.com/solutio
KNOWLEDGE BASED SYSTEMS TECHNOLOGY AND APPLICATIONS IN IMAGE RETRIEVAL
EUGENIO DI SCIASCIO, FRANCESCO M. DONINI, AND MARINA MONGIELLO
1. INTRODUCTION
Visual languages can be broadly classified into two main categories: languages that provide a formalism for visual representation and languages for visual programming. To the first class belong languages that provide a logical interpretation of visual information such as images or pictorial objects. To the second class belong languages that support a visual representation of traditional data types, to provide systems with a more user-oriented interface. We consider the first approach and define a language for the definition of pictorial objects and visual queries on an image knowledge base. In the definition of our language we use a logical formalism instead of an approach based on grammars, since this enforces the importance of semantics in reasoning about image content. The proposed language is made up of a definition language and a query language, both defined following description techniques based on a Knowledge Representation (KR) approach. The approach is declarative and defines a sketch-based language whose syntax and semantics stem from Description Logics (DL), a family of logic formalisms for KR. As any KR formalism, DLs are equipped with a syntax to express pieces of knowledge, a semantics (which for DLs is usually model-theoretic), and a set of reasoning services that infer implicit knowledge from asserted expressions. In such a definition, a set-theoretical view of images is needed both at the syntactic and at the semantic level. This has many advantages: the language we propose is compositional, so it can provide a structured representation of objects; it is possible to
perform logical reasoning about the spatial representation component. Besides, syntactic transformations can be proved to be sound with respect to the semantics. Finally, the method implements a sound and complete algorithm that performs reasoning services typical of a knowledge-based environment, such as subsumption (i.e., query containment), recognition, retrieval and classification. Besides, complex services such as reasoning about queries, e.g., containment and emptiness, can be performed. These services can be used for both exact and approximate matching, using similarity measures. As other approaches do, we start from low-level features extracted with image analysis to detect and characterize regions in an image. However, in contrast with feature-based approaches, the syntax we provide allows one to describe segmented regions as basic objects and complex objects as compositions of basic ones. We believe that the main advantages that a knowledge representation approach brings to research in image retrieval can be summarized as follows:

1. It separates the problem of finding an intuitive semantics for query languages in image retrieval from the problem of implementing a correct algorithm for a given semantics.
2. Once the problem of image retrieval is semantically formalized, results and techniques from Computational Geometry can be exploited in assessing the computational complexity of the formalized retrieval problem, and in devising efficient algorithms, mostly for the approximate image retrieval problem. This is very much in the same spirit as finite model theory has been used in the study of the complexity of query answering for relational databases [14].
3. Our language borrows from object modeling in Computer Graphics the hierarchical organization of classes of images [27]. This, in addition to an interpretation of composite shapes which one can immediately visualize, opens our logical approach to retrieval of images of 3D objects constructed in a geometric language [44].
4. Our logical formalization, although simple, allows for extensions which are natural in logic, such as disjunction of components. Although alternative components of a complex shape are difficult to show in a sketch, they could be used to specify moving (i.e., non-rigid) parts of a composite shape. This exemplifies how our logical approach can shed light on extensions of our syntax suitable for, e.g., video sequence retrieval.
5. The language can be easily extended to represent and reason on vectorial images, and adapted to new standards such as the W3C recommended Scalable Vector Graphics (SVG) [20].

2. KNOWLEDGE REPRESENTATION AND DESCRIPTION LOGICS
To make the work self-contained, we now give a brief introduction to Knowledge Representation and Description Logics; more details can be found in the literature, e.g., [59], [8], [23], [4].
Knowledge Representation. Knowledge Representation provides methods for representing high-level descriptions of the real world that can be used to build systems
able to find implicit consequences of explicitly represented knowledge ("intelligent" applications). The first approaches to KR were roughly classified in two categories: logic-based formalisms, which used first-order calculus to capture facts about the world, and non-logic-based representations, in which knowledge was represented by means of ad hoc data structures (frames, semantic networks). Reasoning in the first category of formalisms amounted to verifying logical consequences, while in the second category it was accomplished by ad hoc procedures that manipulated the structures. Two main realizations in this field led to the definition of Description Logics: the recognition that the core features of frames could be given a semantics by relying on first-order logic, and that, at the same time, frames and semantic networks did not require all the machinery of first-order logic, but could be recognized as fragments of it [4]. This implied that reasoning in structure-based representations could be accomplished by specialized reasoning techniques on different fragments of first-order logic, leading to computational problems of differing complexity. Hence, Description Logics were first considered as representation languages to establish the basic terminology of the modelled domain. In a DL formalism, a knowledge base has an intensional component called TBox (Terminological Box), used to define descriptions of objects and to build complex descriptions, e.g., the scheme of data. The word "terminology" denotes a hierarchical structure built to provide an intensional representation of the domain of interest. Later the emphasis was on the set of constructs admitted in the language to form concepts. The word "concept" refers to the expressions of a DL language denoting sets of "individuals". A knowledge base also has an extensional component called ABox (Assertional Box), i.e., knowledge that is specific to the individuals of the domain of discourse. The integration of the two components, TBox and ABox, enables advanced query processing and answering. Hence, DLs are viewed as the core of knowledge representation systems; they can be useful in the design of a knowledge-based application as a language for defining a knowledge base, and they provide tools to carry out inferences over it, i.e., to perform reasoning services.
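To fix ideas before the formal treatment, the following is a minimal sketch (ours, not the chapter's) of a toy DL restricted to conjunctions of atomic concepts, where a concept is just the set of atoms it requires and subsumption reduces to set inclusion. All concept and individual names are hypothetical.

# Toy DL fragment: concepts are conjunctions of atomic concept names.
TBOX = {                                    # intensional level (terminology)
    "Candle": {"Candle"},
    "Flame": {"Flame"},
    "LightedCandle": {"Candle", "Flame"},   # defined concept
}
ABOX = {                                    # extensional level (individuals)
    "img1": {"Candle", "Flame", "Table"},
    "img2": {"Candle"},
}

def subsumes(c, d):
    """C subsumes D iff every atom C requires is also required by D."""
    return TBOX[c] <= TBOX[d]

def retrieve(c):
    """Individuals of the ABox that are instances of concept c."""
    return sorted(i for i, atoms in ABOX.items() if TBOX[c] <= atoms)

assert subsumes("Candle", "LightedCandle")  # LightedCandle is more specific
assert retrieve("LightedCandle") == ["img1"]

Classification of a new concept then amounts to repeated subsumption tests against the concepts already placed in the hierarchy.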
Description Logics. In DLs, the basic syntax elements are concept names and role names. Intuitively, concepts stand for sets of objects, and roles link objects in different concepts. The semantics of a DL is given through an interpretation: formally, concepts are interpreted as subsets of a domain of interpretation $\Delta$, and roles as binary relations (subsets of $\Delta \times \Delta$). Basic elements can be combined using constructors to form concept and role expressions, and each DL has its distinguished set of constructors. Every known DL allows one to form a conjunction of concepts, usually denoted as $\sqcap$; some DLs [54] also include disjunction $\sqcup$ and complement $\neg$, to close concept expressions under boolean operations. Roles can be combined with concepts using existential role quantification and universal role quantification. Concept expressions can be used in inclusion assertions and definitions, which impose restrictions on possible interpretations according to the knowledge elicited for a given domain. Sets of such inclusions form a TBox. Individuals can be asserted to belong to a concept using
membership assertions in an ABox. Usually, it is assumed that different names denote different elements in the domain. A concept description can also be considered as a query describing a set of objects the user is interested in.

Reasoning services
Description Logics are equipped with reasoning services: logical problems whose solution makes explicit information that was implicit in the assertions. The main reasoning services we are interested in are subsumption, classification, and retrieval. The basic reasoning service in a DL is subsumption, i.e., deciding whether one concept is more general than another. "Automatic classification" refers to the ability to insert a new concept into a taxonomy so that it is directly linked to the most specific concepts that subsume it (are more general than it) and to the most general concepts that it in turn subsumes. Classification thus allows one to place a new concept expression in the proper place in a hierarchical taxonomy of concepts; it is obtained by verifying the subsumption relation between the new concept and the concepts already placed in the hierarchy. Retrieval allows one to find the individuals in the knowledge base that are instances of a given concept.

3. RELATED WORK
Content-Based Image Retrieval (CBIR) has recently become a widely investigated research area. Several systems and approaches have been proposed; here we briefly report on the three main research directions.

3.1. Feature-based approaches
The largest part of research on CBIR has focused on low-level features such as color, texture, and shape, which can be extracted using image processing algorithms and used to characterize an image in some feature space for subsequent indexing and similarity retrieval. In this way the problem of retrieving images with homogeneous content is replaced by the problem of retrieving images visually close to a target one [7, 35, 43, 45, 36, 26, 5, 13, 17, 29]. Among the various projects, particularly interesting is the QBIC system [43, 26], often cited as the ancestor of all other CBIR systems, which allows queries to be performed on shape, texture, and color, by example and by sketch, using as target media both images and shots within videos. The system is currently embedded as a tool in a commercial product, ULTIMEDIA MANAGER. Later versions have introduced an automated foreground/background segmentation scheme. Here the indexing of an image is made on the principal shape, with the aid of some heuristics. This is an evident limitation: most images do not have a main shape, and objects are often composed of various parts. Other researchers, rather than concentrating on a main shape, which is typically assumed to be located in the central part of the picture, have proposed to index regions in images, so that the focus is not on retrieval of similar images, but of similar regions
within an image [55, 38, 12]. The problem is that although all these systems index regions, they lack a higher-level description of images. Hence, they are not able to describe (and hence query for) more than a single region at a time in an image. In order to improve retrieval performance, much interest has grown in recent years towards relevance feedback [51, 18, 17]. Relevance feedback is the mechanism, widely used in textual information systems, which allows improving retrieval effectiveness by incorporating the user in the query-retrieval loop. Depending on the initial query, the system retrieves a set of documents that the user can mark either as relevant or irrelevant. The system, based on user preferences, refines the original query, retrieving a new set of documents that should be closer to the user's information need. This issue is particularly relevant in feature-based approaches as, on the one hand, the user lacks a language to express her information need in a powerful way, but on the other hand, deciding whether an image is relevant or not takes just a glance.

3.2. Approaches based on spatial constraints
This approach concentrates on finding the similarity of images in terms of spatial relations among the objects in them. Usually the emphasis is only on the relative positions of objects, which are considered as "symbolic images" or icons, identified with a single point in the 2D space. Information on the content and visual appearance of images is normally neglected. The modeling of this type of images in terms of 2D-strings is presented in [15], each of the strings accounting for the position of icons along one of the two planar dimensions. In this approach, retrieval of images basically reverts to simpler string matching. The approach in [31] considers the objects in a symbolic image associated with vertexes in a weighted graph. Edges, i.e., lines connecting the centroids of a pair of objects, represent the spatial relationships among the objects and are associated with a weight depending on their slope. The symbolic image is represented as an edge list. Given the edge lists of a query and a database image, a similarity function computes the degree of closeness between the two lists as a measure of the matching between the two spatial graphs. The similarity measure depends on the number of edges and on the comparison between the orientation and slope of edges in the two spatial graphs. The algorithm is robust with respect to scale and translation variants, in the sense that it assigns the highest similarity to an image that is a scale or translation variant of the query image. An extended algorithm includes also rotational variants of the original images. More recent papers on the topic are [30, 25], which basically propose extensions of the strings approach for efficient retrieval of subsets of icons. ΘR-strings are proposed in [30] as a logical representation of an image. Such a representation also provides a geometry-based approach to iconic indexing, based on spatial relationships between the iconic objects in an image, individuated by their centroid coordinates. Translation, rotation and scale variant images, and the variants generated by an arbitrary composition of these three geometric transformations, are considered. The approach does not deal with object shapes, nor with other basic image features, and considers only the sequence of the names of the objects. The concatenation of the objects is based on the euclidean distance of the domain objects in the image starting from a reference
point. The similarity between a database and a query image is obtained through a spatial similarity algorithm that measures the degree of similarity by comparing their ΘR-strings. The algorithm recognizes rotation, scale and translation variants of the image, and also subimages, as subsets of the domain objects. A constraint limiting the practical use of this approach is the assumption that an image can contain at most one instance of each icon or object. An extension of the spatial-graph approach is presented in [25], and includes both topological and directional constraints. The topological extension of the objects can obviously be useful in determining further differences between images that might be considered similar by a directional algorithm that considers only the locations of objects in terms of their centroids. The similarity algorithm extends the graph-matching one previously described in [31]. The similarity between two images is based on three factors: the number of common objects, and the directional and topological spatial constraints between the objects. The similarity measure includes the number of objects, the number of common objects, and a function that determines the topological difference between corresponding object pairs in the query and in the database image. The algorithm retains the properties of the original approach, including its invariance to scaling, rotation and translation, and is also able to recognize multiple rotation variants. An algorithm that measures a weighted global similarity between a sketched query and a database image is proposed in [19].

3.3. Logic-based approaches
The use of structural descriptions of objects for the recognition of their images can be dated back to Minsky's frames, and to the work in [9]. The idea is to associate parts of an object or of a scene to the regions an image can be segmented into. The hierarchical organization of knowledge to be used in the recognition of an object was first proposed in [39]. A formalism to reason about maps as sketched diagrams was proposed in [49]. In this approach, the possible relative positions of lines are fixed and highly qualitative, such as touching and intersecting. Structured descriptions of three-dimensional images are already present in languages for virtual reality such as VRML [34], or in hierarchical object modeling. However, the semantics of these languages is operational, and no effort is made to automatically classify objects with respect to the structure of their appearance. A formalism integrating Description Logics and image and text retrieval was proposed in [40], while the integration of Description Logics with spatial reasoning was proposed in [32]. Further extensions of the approach are described in [41]. Both proposals integrate Description Logics and concrete domains [3]. However, neither of the formalisms can be used to build complex shapes by nesting simpler shapes. Moreover, the work in [32] is based on the RCC8 logic, which, although effective for specifying meaningful relations in a map, is too qualitative to specify the relative sizes and positions of regions in a complex shape. Also in [33], Description Logics and concrete domains are at the basis of a logical framework for image databases aimed at reasoning on query containment. Unfortunately,
the proposed formalism can neither consider geometric transformations nor determine specific arrangements of shapes. In [2], parts of a complex shape are described with a Description Logic. However, the composition of shapes does not consider their positions; hence reasoning cannot take positions into account. Relative positions of parts of a complex shape are expressed in a constraint relational calculus in [6]. However, reasoning about queries (containment and emptiness) is not considered. In [1], a multi-modal logic is devised, which provides a formalism for expressing topological properties and for defining a distance measure among patterns. Spatial relations between parts of medical tomographic images are considered in [57]. There, medical images are formed by the intersection of the image plane and an object. As the image plane changes, different parts of the object are considered. Besides, a metric for arrangements is formulated by expressing arrangements in terms of the Voronoi diagram of the parts. Compositions of parts of an image are considered in [53] for character recognition. The approach does not use an extensional semantics for composite shapes, hence no reasoning is possible. A logic-based multimedia retrieval system was proposed in [28]; the method, based on an object-oriented logic, supports aggregated objects but is oriented towards high-level semantic indexing, which neglects the low-level features that characterize images and parts of them. In the field of computational theories of recognition, we mention two approaches that have some resemblance to our own: Biederman's structural decomposition and the geometric constraints proposed by Ullman, both described in [24]. Unfortunately, neither of them appears suitable for realistic image retrieval: the structural decomposition approach does not consider geometric constraints between shapes, while the approach based on geometric constraints does not consider the possibility of defining structural decompositions of shapes, hence reasoning on them. Starting with the reasonable assumption that the recognition of an object in a scene can be eased by previous knowledge of the context, in [46] the recognition task, or the interpretation of an image, takes advantage of the information a cognitive agent has about the environment, and of the representation of these data in a high-level formalism. A structured knowledge representation approach to image retrieval is proposed in [21].

4. PROPOSED KNOWLEDGE BASED APPROACH
We present here our approach, which adopts a formalism that allows the definition of composite shape descriptions, together with a companion extensional and compositional semantics. Notice that our formalism deals with image features, such as shape, color, and texture, but is basically independent of the way features are actually extracted from images.

4.1. Syntax
Our main syntactic objects are basic shapes, position of shapes, composite shape descriptions, and transformations. We also take into account the other features that typically determine the visual appearance of an image, namely color and texture.
Basic shapes are denoted with the letter $B$, and have an edge contour $e(B)$ characterizing them. We assume that $e(B)$ is described as a single, closed 2D curve in a space whose origin coincides with the centroid of $B$. Examples of basic shapes are circle and rectangle, but any complete, rough contour is also a basic shape. To make our language compositional, we consider only the external contour of a region. The possible transformations are the simple ones present in any drawing tool: rotation (around the centroid of the shape), scaling and translation. We globally denote a rotation-translation-scaling transformation as $\tau$. Recall that transformations can be composed in sequences $\tau_1 \circ \cdots \circ \tau_n$, and that they form a mathematical group. The basic building block of our syntax is a basic shape component $(c, t, \tau, B)$, which represents a region with color $c$, texture $t$, and edge contour $\tau(e(B))$. With $\tau(e(B))$ we denote the pointwise transformation $\tau$ of the whole contour of $B$. For example, $\tau$ could specify to place the contour $e(B)$ in the upper left corner of the image, scaled by 1/2 and rotated 45 degrees clockwise. Composite shape descriptions are conjunctions of basic shape components, each one with its own color and texture, denoted as:

$(c_1, t_1, \tau_1, B_1) \sqcap \cdots \sqcap (c_n, t_n, \tau_n, B_n)$
We do not expect end users of our system to actually define composite shapes with this syntax; this is just the internal representation of a composite shape. The system can maintain it while the user draws, with the help of a graphic tool, the complex shape by dragging, rotating and scaling basic shapes chosen either from a palette or from existing images (see Figure 1). For example, the composite shape lighted-candle could be defined as

$\mathit{lighted\text{-}candle} = (c_1, t_1, \tau_1, \mathit{rectangle}) \sqcap (c_2, t_2, \tau_2, \mathit{circle})$
with $\tau_1$, $\tau_2$ placing the circle as a flame on top of the candle, and textures and colors defined according to intuition. In a previous paper [18] we presented a formalism including nested composite shapes, as is done in hierarchical object modeling [27, Ch. 7]. However, nested composite shapes can always be flattened by composing their transformations. Hence, in this paper we focus on two levels: basic shapes and compositions of basic shapes. Also, just to simplify the presentation of the semantics, in the following section we do not consider color and texture features, which we take into account later on.
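To make the internal representation concrete, the following is a minimal sketch under our own naming assumptions (not the authors' implementation): a transformation tau is stored as a 2D similarity transform in homogeneous coordinates, and a composite description is a list of (color, texture, tau, contour) tuples.

import math

def similarity(scale=1.0, angle=0.0, dx=0.0, dy=0.0):
    """3x3 homogeneous matrix: rotation by angle, uniform scale, translation."""
    c, s = math.cos(angle), math.sin(angle)
    return [[scale * c, -scale * s, dx],
            [scale * s,  scale * c, dy],
            [0.0,        0.0,       1.0]]

def compose(t1, t2):
    """t1 o t2: apply t2 first, then t1 (matrix product)."""
    return [[sum(t1[i][k] * t2[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply_tau(t, contour):
    """Pointwise transformation of a contour, a list of (x, y) points."""
    return [(t[0][0] * x + t[0][1] * y + t[0][2],
             t[1][0] * x + t[1][1] * y + t[1][2]) for (x, y) in contour]

# lighted-candle = (c1, t1, tau1, rectangle) ⊓ (c2, t2, tau2, circle)
rectangle = [(-1.0, -2.0), (1.0, -2.0), (1.0, 2.0), (-1.0, 2.0)]
circle = [(math.cos(2 * math.pi * k / 16), math.sin(2 * math.pi * k / 16))
          for k in range(16)]
lighted_candle = [                       # (color, texture, tau, contour)
    ("white",  "wax",  similarity(),                  rectangle),
    ("yellow", "flat", similarity(scale=0.5, dy=2.5), circle),
]

The compose helper mirrors the group structure of transformation sequences noted above.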
4.2. Semantics

We consider an extensional semantics, in which syntactic expressions are interpreted as subsets of a domain. For our setting, the domain of interpretation is a set of images $\Delta$, and shapes and components are interpreted as subsets of $\Delta$. Hence, an image database is also a domain of interpretation, and a complex shape $C$ is interpreted as a subset of such a domain: the images to be retrieved from the database when $C$ is viewed as a query.
Figure 1. Schematic of a query composition.
This approach is quite different from previous logical approaches to image retrieval that view the image database as a set of facts, or logical assertions, e.g., the one based on Description Logics in [40]. In that setting, image retrieval amounts to logical inference. However, observe that usually a Domain Closure Assumption [50] is made for image databases: there are no regions but the ones that can be seen in the images themselves. This allows one to consider the problem of image retrieval as simple model checking: check whether a given structure satisfies a description. Obviously, a Domain Closure Assumption on regions is not valid in artificial vision, which deals with two-dimensional images of three-dimensional shapes (and scenes), because solid shapes have surfaces that are hidden in their images. Formally, an interpretation is a pair $(\mathcal{I}, \Delta)$, where $\Delta$ is a set of images, and $\mathcal{I}$ is a mapping from shapes and components
to subsets of $\Delta$. We identify each image $I$ with the set of regions $\{r_1, \ldots, r_m\}$ it can be segmented into. Each region $r$ comes with its own edge contour $e(r)$. An image $I \in \Delta$ belongs to the interpretation of a basic shape component $(\tau, B)^{\mathcal{I}}$ if $I$ contains a region whose contour matches $\tau(e(B))$. In formulae,

$(\tau, B)^{\mathcal{I}} = \{\, I \in \Delta \mid \exists r \in I : e(r) = \tau(e(B)) \,\} \qquad (1)$
The above definition covers only exact recognition of shape components in images, due to the presence of strict equality in the comparison of contours; but it can be extended to approximate recognition as follows. Recall that the characteristic function $f_S$ of a set $S$ is a function whose value is either 1 or 0: $f_S(x) = 1$ if $x \in S$, and $f_S(x) = 0$ otherwise. We consider now the characteristic function of the set defined in Formula (1). Let $I$ be an image; if $I$ belongs to $(\tau, B)^{\mathcal{I}}$, then the characteristic function computed on $I$ has value 1, otherwise it has value 0. To keep the number of symbols low, we use the expression $(\tau, B)^{\mathcal{I}}$ also to denote the characteristic function (with an argument $(I)$ to distinguish it from the set).

$(\tau, B)^{\mathcal{I}}(I) = \begin{cases} 1 & \text{if } \exists r \in I : e(r) = \tau(e(B)) \\ 0 & \text{otherwise} \end{cases}$
Now we reformulate this function in order to make it return a real number in the range $[0, 1]$, as usual in fuzzy logic [61]. Let $\mathrm{sim}(\cdot,\cdot)$ be a similarity measure from pairs of contours into the range $[0, 1]$ of real numbers (where 1 is perfect matching). We use $\mathrm{sim}(\cdot,\cdot)$ instead of equality to compare contours. Moreover, the existential quantification can be replaced by a maximum over all possible regions in $I$. Then, the characteristic function for the approximate recognition of a basic component in an image $I$ is:

$(\tau, B)^{\mathcal{I}}(I) = \max_{r \in I} \{\, \mathrm{sim}(e(r), \tau(e(B))) \,\}$
Note that $\mathrm{sim}$ depends on translations, rotation and scaling, since we are looking for regions in $I$ whose contour matches $e(B)$ with reference to the position and size specified by $\tau$. The interpretation of basic shapes, instead, includes a translation-rotation-scaling invariant recognition, which is commonly used in single-shape image retrieval. We define the interpretation of a basic shape as

$B^{\mathcal{I}} = \{\, I \in \Delta \mid \exists \tau\ \exists r \in I : e(r) = \tau(e(B)) \,\}$

and its approximate counterpart as the function

$B^{\mathcal{I}}(I) = \max_{\tau} \max_{r \in I} \{\, \mathrm{sim}(e(r), \tau(e(B))) \,\}$
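The chapter leaves $\mathrm{sim}$ generic; purely as an illustration, the sketch below scores a placed basic shape against an image's regions using a naive contour distance (a stand-in assumption, not the measure used by the authors), implementing the inner maximization of the formula above.

import math

def resample(contour, k=32):
    """Crude uniform resampling of a closed contour to k points."""
    n = len(contour)
    return [contour[(i * n) // k] for i in range(k)]

def sim(contour_a, contour_b):
    """Similarity in [0, 1]; 1 means the two contours coincide pointwise."""
    a, b = resample(contour_a), resample(contour_b)
    d = sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)
    return 1.0 / (1.0 + d)

def component_score(placed_contour, regions):
    """(tau, B)^I(I): best match of the placed contour over all regions."""
    return max(sim(r, placed_contour) for r in regions)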
Figure 2. The semantics of the proposed language.
The maximization over all possible transformations $\max_{\tau}$ can be effectively computed by using a similarity measure that is invariant with respect to translation, rotation and scaling. Similarity of color and texture will be added as a weighted sum later on. In this way, a basic shape $B$ can be used as a query to retrieve all images from $\Delta$ which are in $B^{\mathcal{I}}$. Therefore, our approach generalizes the more usual approaches for single-shape retrieval, such as Blobworld [12]. Composite shape descriptions are interpreted as sets of images that contain all components of the composite shape. Components can be anywhere in the image, as long as they are in the described arrangement relative to each other. Let $C$ be a composite shape description $(\tau_1, B_1) \sqcap \cdots \sqcap (\tau_n, B_n)$. In exact matching, the interpretation is the intersection of the sets interpreting each component of the shape:

$C^{\mathcal{I}} = \{\, I \in \Delta \mid \exists \tau\ \forall i \in \{1, \ldots, n\}\ \exists r \in I : e(r) = \tau(\tau_i(e(B_i))) \,\} \qquad (2)$
Figure 2 shows the semantics of the proposed language. Observe that we require all shape components of $C$ to be transformed into image regions using the same transformation $\tau$. This preserves the arrangement of the shape components relative to each other, given by each $\tau_i$, while allowing $C^{\mathcal{I}}$ to include every image containing a group of regions in the right arrangement, wholly displaced by $\tau$. To clarify this formula, consider Figure 3: the shape $C$ is composed of two basic shapes $B_1$ and $B_2$, suitably arranged by the transformations $\tau_1$ and $\tau_2$. Suppose now that
Figure 3. An example of application of Formula (2).
$\Delta$ contains the image $I$. Then $I \in C^{\mathcal{I}}$, because there exists the transformation $\tau$ which globally brings $C$ into $I$; that is, $\tau \circ \tau_1$ brings the rectangle $B_1$ into a rectangle recognized in $I$, and $\tau \circ \tau_2$ brings the circle $B_2$ into a circle recognized in $I$, both arranged according to $C$. Note that $I$ could also contain other shapes not included in $C$.
Definition 1 [Recognition] A shape description $C$ is recognized in an image $I$ if for every interpretation $(\mathcal{I}, \Delta)$ such that $I \in \Delta$, it is $I \in C^{\mathcal{I}}$. An interpretation $(\mathcal{I}, \Delta)$ satisfies a composite shape description $C$ if there exists an image $I \in \Delta$ such that $C$ is recognized in $I$. A composite shape description is satisfiable if there exists an interpretation satisfying it.

Observe that shape descriptions could be unsatisfiable: if two components define overlapping regions, no image can be segmented in a way that satisfies both components. Of course, if composite shape descriptions are built using a graphical tool, unsatisfiability can easily be avoided, so we assume that descriptions are always satisfiable. Anyway, unsatisfiable shape descriptions could easily be detected from their syntactic form, since unsatisfiability can only arise because of overlapping regions (see Proposition 4). Observe also that our set-based semantics implies the intuitive interpretation of conjunction "$\sqcap$": one can easily prove that $\sqcap$ is commutative and idempotent. For approximate matching, we modify definition (2), following the fuzzy interpretation of $\sqcap$ as minimum, and of existentials as maximum:

$C^{\mathcal{I}}(I) = \max_{\tau}\ \min_{i=1,\ldots,n}\ \max_{r \in I} \{\, \mathrm{sim}(e(r), \tau(\tau_i(e(B_i)))) \,\} \qquad (3)$
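Formula (3) can be read directly as nested max/min loops. The sketch below brute-forces the outer maximum over a caller-supplied, discretized set of candidate global transformations; candidate_taus, apply_tau and sim are placeholders for the pieces sketched earlier, and the discretization is our assumption (Section 5 shows how the transformation can instead be pinned down from centroid pairs).

def score_composite(components, regions, candidate_taus, apply_tau, sim):
    """Formula (3). components: list of (tau_i, contour_i) pairs of C;
    regions: contours of image I. Returns a score in [0, 1]."""
    return max(
        min(                                    # weakest component dominates
            max(sim(r, apply_tau(tau, apply_tau(tau_i, contour_i)))
                for r in regions)               # best region for component i
            for (tau_i, contour_i) in components)
        for tau in candidate_taus)

The minimum makes the least similar component dominate the score, which is exactly the behaviour discussed next.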
Our interpretation of composite shape descriptions strictly requires the presence of all components. In fact, the measure by which an image I belongs to the interpretation of
a composite shape description $C^{\mathcal{I}}(I)$ is dominated by the least similar shape component (the one with the minimum similarity). Hence, if a basic shape component is very dissimilar from every region in $I$, this brings the measure $C^{\mathcal{I}}(I)$ near to 0 as well. This is stricter than, e.g., Gudivada & Raghavan's or El-Kwae & Kabuka's approaches, in which a non-appearing component can decrease the similarity value of $C^{\mathcal{I}}(I)$, but $I$ can still be above a threshold. Although this requirement may seem a strict one, it captures the way details are used to refine a query: the "dominant" shapes are used first, and, if the retrieved set is still too large, the user adds details to restrict the results. In this refinement process, it should not happen that other images that match only some new details "pop up", enlarging the set of results that the user was trying to restrict. We formalize this refinement process through the following definition.
Proposition 1 [Downward refinement] Let $C$ be a composite shape description, and let $D$ be a refinement of $C$, that is, $D \doteq C \sqcap (\tau', B')$. For every interpretation $\mathcal{I}$: if shapes are interpreted as in (2), then $D^{\mathcal{I}} \subseteq C^{\mathcal{I}}$; if shapes are interpreted as in (3), then for every image $I$ it holds that $D^{\mathcal{I}}(I) \leq C^{\mathcal{I}}(I)$.

Proof. For (2), the claim follows from the fact that $D^{\mathcal{I}}$ considers an intersection of the same components as that of $C^{\mathcal{I}}$, plus the set $((\tau \circ \tau'), B')^{\mathcal{I}}$. For (3), the claim analogously follows from the fact that $D^{\mathcal{I}}(I)$ computes a minimum over a superset of the values considered for $C^{\mathcal{I}}(I)$.

The above property makes our language fully compositional. Namely, let $C$ be a composite shape description; we can consider the meaning of $C$, when used as a query, as the set of images that can potentially be retrieved using $C$. At least, this will be the meaning perceived by an end user of a system. Downward refinement ensures that the meaning of $C$ can be obtained by starting with one component, and then progressively adding other components in any order. We remark that this property does not hold for the other frameworks cited above [31, 25]. We illustrate the problem in Figure 4. Starting with shape description $C$, we may retrieve (among many others) the two images $I_1$, $I_2$, for which both $C^{\mathcal{I}}(I_1)$ and $C^{\mathcal{I}}(I_2)$ are above a threshold $t$, while another image $I_3$ is not in the set because $C^{\mathcal{I}}(I_3) < t$. In order to be more selective, we try adding details, and we obtain the shape description $D$. Using $D$, we may still retrieve $I_2$, and discard $I_1$. However, $I_3$ now partially matches the new details of $D$. If downward refinement holds, $D^{\mathcal{I}}(I_3) \leq C^{\mathcal{I}}(I_3) < t$, and $I_3$ cannot "pop up". In contrast, if downward refinement does not hold (as in [31]), it can be that $D^{\mathcal{I}}(I_3) > t > C^{\mathcal{I}}(I_3)$, because matched details in $D$ raise the similarity sum weighted over all components. In this case, the meaning of a sketch cannot be defined in terms of its components. Downward refinement is a property linking syntax to semantics. Thanks to the extensional semantics, it can be extended to an even more meaningful semantic relation, namely subsumption. We borrow this definition from Description Logics [23], and its fuzzy extensions [60, 56].
Figure 4. Downward refinement: the thin arrows denote non-zero similarity in approximate recognition. The thick arrow denotes a refinement [21].
Definition 2 [Subsumption] A description $C$ subsumes a description $D$ if for every interpretation $\mathcal{I}$, $D^{\mathcal{I}} \subseteq C^{\mathcal{I}}$. If (3) is used, $C$ subsumes $D$ if for every interpretation $\mathcal{I}$ and image $I \in \Delta$, it is $D^{\mathcal{I}}(I) \leq C^{\mathcal{I}}(I)$.

Subsumption takes into account the fact that a description might contain a syntactic variant of another, without either the user or the system explicitly knowing this fact. The notion of subsumption extends downward refinement. It also enables a hierarchy of shape descriptions, in which a description $D$ is below another description $C$ if $D$ is subsumed by $C$. When $C$ and $D$ are used as queries, the subsumption hierarchy makes it easy to detect query containment. Containment can be used to speed up retrieval: all images retrieved using $D$ as a query can be immediately retrieved also when $C$ is used as a query,
Figure 5. An example of subsumption hierarchy of shapes (thick arrows), and images in which the shapes can be recognized (thin arrows) [18].
without recomputing similarities. While query containment is important in standard databases [58], it becomes even more important in an image retrieval setting, since the recognition of specific features in an image can be computationally demanding. Figure 5 illustrates an example of a subsumption hierarchy of basic and composite shapes (thick arrows denote subsumption between shapes), and two images in which shapes can be recognized (thin arrows). Although we did not consider a background, it could be added to our framework as a special basic component $(c, t, \tau, \mathit{background})$ with the property that a region $b$ satisfies the background simply if their colors and textures match, with no check on the edge contours. Also, more than one background could be added; in that case background regions should not overlap, and the matching of background regions
should be considered after the regions of all the recognized basic shapes are subtracted from the background regions.

5. REASONING AND RETRIEVAL
We envisage several reasoning services that can be carried out in a logic for image retrieval:

1. shape recognition: given an image $I$ and a shape description $D$, decide whether $D$ is recognized in $I$.
2. image retrieval: given a database of images and a shape description $D$, retrieve all images in which $D$ can be recognized.
3. image classification: given an image $I$ and a collection of descriptions $D_1, \ldots, D_n$, find which descriptions can be recognized in $I$. In practice, $I$ is classified by finding the most specific descriptions (with reference to subsumption) it satisfies. Observe that classification is a way of "preprocessing" recognition.
4. description subsumption (and classification): given a (new) description $D$ and a collection of descriptions $D_1, \ldots, D_n$, decide whether $D$ subsumes (or is subsumed by) each $D_i$, for $i = 1, \ldots, n$.

While services 1-2 are standard in an image retrieval system, services 3-4 are less obvious, and we briefly discuss them below. The process of image retrieval is quite expensive, and systems usually perform off-line processing of data, amortizing its cost over several queries to be answered on-line. As an example, all document retrieval systems for the web, both for images and text, use spiders to crawl the web and extract some relevant features (e.g., color distributions and textures in images, keywords in texts), which are used to classify documents. Then, the answering process uses such classified, extracted features of documents, and not the original data. Our approach can adapt this setting to composite shapes, too. In our approach, a new image inserted in the database is immediately segmented and classified in accordance with the basic shapes that compose it, and the composite descriptions it satisfies (Service 3). A query also undergoes the same classification, with reference to the queries already answered (Service 4). The more basic shapes are present, the faster the system will answer new queries based on these shapes. More formally, given a query (shape description) $D$, if there exists a collection of descriptions $D_1, \ldots, D_n$ and all images in the database were already classified with reference to $D_1, \ldots, D_n$, then it may suffice to classify $D$ with reference to $D_1, \ldots, D_n$ to find (most of) the images satisfying $D$. This is the usual way in which classification in Description Logics, which amounts to a semantic indexing, can help query answering [42]. For example, to answer the query asking for images containing an arch, a system may classify arch and find that it subsumes threePortalsGate (see Figure 5). Then, the system can include in the answer all images in which ancient Roman gates can be recognized, without recomputing whether these images contain an arch or not.
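A toy sketch of this semantic-indexing idea (all names hypothetical): answers already computed for classified descriptions are reused for any new query that subsumes them, so expensive image-level matching is not repeated.

def answer_query(query, classified, answers, subsumes, match_images):
    """classified: descriptions already placed in the taxonomy;
    answers: description -> set of image ids known to satisfy it;
    subsumes(c, d): True iff c is more general than d;
    match_images(d): expensive image-level retrieval, used as a fallback."""
    result = set()
    for d in classified:
        if subsumes(query, d):          # e.g., arch subsumes threePortalsGate
            result |= answers[d]        # reuse cached answers, no recomputation
    return result if result else match_images(query)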
Di Sciascia et al.
The problem of computing subsumption between descriptions is reduced to recognition in the next section, where an algorithm for exact recognition is also given. We then extend the algorithm to realistic approximate recognition, reconsidering color and texture.

5.1. Exact reasoning on images and descriptions
Theorem 2 [Recognition as mapping] Let $C = (\tau_1, B_1) \sqcap \cdots \sqcap (\tau_n, B_n)$ be a composite shape description, and let $I$ be an image, segmented into regions $\{r_1, \ldots, r_m\}$. Then $C$ is recognized in $I$ iff there exist a transformation $\tau$ and an injective mapping $j: \{1, \ldots, n\} \to \{1, \ldots, m\}$ such that for $i = 1, \ldots, n$ it is

$e(r_{j(i)}) = \tau(\tau_i(e(B_i)))$

Proof. $C$ is recognized in $I$ iff $I \in C^{\mathcal{I}}$ for every interpretation $(\mathcal{I}, \Delta)$ with $I \in \Delta$. Expanding each component $((\tau \circ \tau_i), B_i)^{\mathcal{I}}$ with its definition yields

$\exists \tau\ \bigl[\ \forall i \in \{1, \ldots, n\}\ \exists r \in I : e(r) = \tau(\tau_i(e(B_i)))\ \bigr]$

and since the regions in $I$ are $\{r_1, \ldots, r_m\}$, each inner existential quantification is equivalent to an $m$-fold disjunction. Making explicit the disjunctions over $j$ and the conjunctions over $i$, we can arrange this conjunctive formula as a matrix:

$\exists \tau : \bigwedge_{i=1}^{n} \left[\ \bigvee_{j=1}^{m}\ e(r_j) = \tau(\tau_i(e(B_i)))\ \right] \qquad (4)$

in which row $i$ lists, as disjuncts, the equalities between the $m$ region contours and the placed contour of component $B_i$.
Now we note two properties of the above matrix of equalities:

1. For a given transformation, at most one region among $r_1, \ldots, r_m$ can be equal to each component. This means that in each row, at most one disjunct can be true for a given $\tau$.
2. For a given transformation, a region can match at most one component. This means that in each column, at most one equality can be true for a given $\tau$.
We observe that these properties do not imply that the regions all have different shapes, since the equality of contours depends on translation, rotation, and scaling: we use equality to represent true overlap, not just equal shape. Properties 1-2 imply that the above formula is true iff there is an injective function mapping each component to one region it matches. To ease the comparison with the formulae above, we use the same symbol $j$ for this mapping $j: \{1, \ldots, n\} \to \{1, \ldots, m\}$. Hence, Formula (4) can be rewritten into the claim:

$\exists \tau\ \exists j\ \text{injective} : \bigwedge_{i=1}^{n}\ e(r_{j(i)}) = \tau(\tau_i(e(B_i))) \qquad (5)$
Hence, even if in the previous section the semantics of a composite shape was derived from the semantics of its components, in computing whether an image contains a composite shape one can focus on groups of regions, one group $r_{j(1)}, \ldots, r_{j(n)}$ for each possible mapping $j$. Observe that $j$ being injective implies $m \geq n$, as one would expect.
In practice, from a composite shape description one builds its prototypical image just applying the stated transformations to its components (and color/texture fillings, if present). Recall that we envisage this prototypical image to be built directly by the user, with the help of a drawing tool, with basic shapes and colors as palette items. The system will just keep track of the transformations corresponding to the user's actions, and use them in building the (internal) shape descriptions stored with the previous syntax. The feature that makes our proposal different from other query-bysketch retrieval systems, is precisely that our sketches have also a logical meaning. So, properties about description/sketches can be proved, containment between query sketches can be stated in a formal way, and algorithms for containment checking can be proved correct with reference to the semantics. Prototypical images have some important properties. The first is that they satisfythe shape description they exemplify-v-as intuition would suggest.
Proposition 3 For every composite shape description D, ifD issatiifrable then the interpretation (::S,{I(D)}) satisjies D. Proof From Theorem 2, using an identical transformation forj.
T
and the identity mapping
A shape description 0 is satisfiableif there are no overlapping regions in 1(0). Since this is obvious when 0 is specified by a drawing tool, we just give the following proposition for sake of completeness.
Proposition 4 A shape description D is satiifrable iffitsprototypical image I(D) contains no overlapping regions. We now turn to subsumption. Observe that if B I and B2 are basic shapes, either they are equivalent (each one subsumes the other) or neither of the two subsumes the other. If we adopt for the segmented regions an invariant representation, deciding equivalence between basic shapes, or recognizing whether a basic shape appears in an image, is just a call to an algorithm computing the similarity between shapes. This is what usual image recognizers do-allowing for some tolerance in the matching of the shapes. Therefore, our framework extends the retrieval of shapes made of a single component, for which effective systems are already available. We now consider composite shape descriptions, and prove the main property of prototypical images, namely, the fact that subsumption between shape descriptions can be decided by checking if the subsumer can be recognized in the sketch of the subsumee.
Theorem 5 A composite shape description $C$ subsumes a description $D$ if and only if $C$ is recognized in the prototypical image $I(D)$.

Proof. Let $C = (\tau_1, B_1) \sqcap \cdots \sqcap (\tau_n, B_n)$, and let $D = (\sigma_1, A_1) \sqcap \cdots \sqcap (\sigma_m, A_m)$. Recall that $I(D)$ is defined by $I(D) = \{\sigma_1(e(A_1)), \ldots, \sigma_m(e(A_m))\}$. To ease the reading, we sketch the idea of the proof in Figure 6.

If. Suppose $C$ is recognized in $I(D)$, that is, $I(D) \in C^{\mathcal{I}}$ for every interpretation $(\mathcal{I}, \Delta)$ such that $I(D) \in \Delta$. Then, from Theorem 2 there exist a transformation $\hat{\tau}$ and a suitable injective function $j$ from $\{1, \ldots, n\}$ into $\{1, \ldots, m\}$ matching the components of $C$ to the regions of $I(D)$. Since $I(D)$ is the prototypical image of $D$, we can substitute each region with the basic shape of $D$ it comes from:

$\sigma_{j(i)}(e(A_{j(i)})) = \hat{\tau}(\tau_i(e(B_i))) \quad \text{for } i = 1, \ldots, n \qquad (6)$
Figure 6. Schematic of the If-proof of Theorem 5 [21].
Now suppose that $D$ is recognized in an image $J = \{s_1, \ldots, s_p\}$, with $J \in \Delta$. We prove that $C$ is also recognized in $J$. In fact, if $D$ is recognized in $J$ then there exist a transformation $\bar{\tau}$ and another injective mapping $q$ from $\{1, \ldots, m\}$ into $\{1, \ldots, p\}$ selecting from $J$ the regions $\{s_{q(1)}, \ldots, s_{q(m)}\}$ such that

$e(s_{q(k)}) = \bar{\tau}(\sigma_k(e(A_k))) \quad \text{for } k = 1, \ldots, m \qquad (7)$

Now composing $q$ and $j$, that is, selecting the regions of $J$ satisfying those components of $D$ which are used to recognize $C$, one obtains

$e(s_{q(j(k))}) = \bar{\tau}(\sigma_{j(k)}(e(A_{j(k)}))) \quad \text{for } k = 1, \ldots, n \qquad (8)$

Then, substituting equals for equals from (6), one finally gets

$e(s_{q(j(k))}) = \bar{\tau}(\hat{\tau}(\tau_k(e(B_k)))) \quad \text{for } k = 1, \ldots, n$

which proves that $C$ too is recognized in $J$, using $\bar{\tau} \circ \hat{\tau}$ as the transformation of its components, and $q(j(\cdot))$ as the injective mapping from $\{1, \ldots, n\}$ into $\{1, \ldots, p\}$. Since $J$ is a generic image, it follows that $D^{\mathcal{I}} \subseteq C^{\mathcal{I}}$. Since $(\mathcal{I}, \Delta)$ is generic too, $C$ subsumes $D$.
Only if. The reverse direction is easier: suppose $C$ subsumes $D$. By definition, this amounts to $D^{\mathcal{I}} \subseteq C^{\mathcal{I}}$ for every collection of images $\Delta$. For every $\Delta$ that contains $I(D)$, we have $I(D) \in D^{\mathcal{I}}$ by Proposition 3. Therefore, $I(D) \in C^{\mathcal{I}}$, that is, $C$ is recognized in $I(D)$.

This property allows us to compute subsumption as recognition, so we concentrate on complex shape recognition, using Theorem 2. Our concern is how to decide whether there exist a transformation $\tau$ and a matching $j$ having the properties stated in Theorem 2. It turns out that for exact recognition, a quadratic upper bound can be attained for the possible transformations to try.
Theorem 6 Let $C = (\tau_1, B_1) \sqcap \cdots \sqcap (\tau_n, B_n)$ be a composite shape description, and let $I$ be an image, segmented into regions $\{r_1, \ldots, r_m\}$. Then there are at most $m(m-1)$ exact matches between the $n$ basic shapes and the $m$ regions. Moreover, each possible match can be verified by checking the matching of $n$ pairs of contours.

Proof. A transformation $\tau$ matching basic components exactly to regions is also an exact match for their centroids. Hence we concentrate on centroids. Each correspondence between a centroid of a basic component and a centroid of a region yields two constraints on $\tau$. Now $\tau$ is a rigid motion with scaling, hence it has four degrees of freedom (two for translation, one for rotation, and one for uniform scaling). Hence, if an exact match $\tau$ exists between the centroids of the basic components and the centroids of some of the regions, then $\tau$ is completely determined by the transformation of any two centroids of the basic shapes into two centroids of the regions. Fixing any pair of basic components $B_1$, $B_2$, let $p_1$, $p_2$ denote their centroids. Also, let $r_{j(1)}$, $r_{j(2)}$ be the regions that correspond to $B_1$, $B_2$, and let $v_{j(1)}$, $v_{j(2)}$ denote their centroids. There is only one transformation $\tau$ solving the point equations (each one mapping a point into another):

$\tau(\tau_1(p_1)) = v_{j(1)}, \qquad \tau(\tau_2(p_2)) = v_{j(2)}$

Hence, there are only $m(m-1)$ such transformations. For the second claim, once a $\tau$ matching the centroids is found, one checks that the edge contours of basic components and regions coincide, i.e., that $\tau(\tau_1(e(B_1))) = e(r_{j(1)})$, $\tau(\tau_2(e(B_2))) = e(r_{j(2)})$, and, for $k = 3, \ldots, n$, that $\tau(\tau_k(e(B_k)))$ coincides with the contour of some region $e(r_{j(k)})$. Recalling Formula (5) in the proof of Theorem 2, we can eliminate the outer quantifier in (5) using a computed $\tau$, and conclude that $C$ is recognized in $I$ iff

$\exists j: \{1, \ldots, n\} \to \{1, \ldots, m\}\ \text{injective} : \bigwedge_{i=1}^{n}\ e(r_{j(i)}) = \tau(\tau_i(e(B_i)))$
Observe that, to prune the above search, once a τ has been found as above, one can check for k = 3, ..., n that τ(τ_k(centr(B_k))) coincides with the centroid of some region r_j, before checking contours. Based on Theorem 6, we can devise the following algorithm:
Algorithm Recognize(C, I);
input: a composite shape description C = ⟨τ_1, B_1⟩ ⊓ ... ⊓ ⟨τ_n, B_n⟩, and an image I, segmented into regions r_1, ..., r_m
output: True if C is recognized in I, False otherwise
begin
(1) compute the centroids v_1, ..., v_m of r_1, ..., r_m
(2) compute the centroids p_1, ..., p_n of the components of C
(3) for i, h ∈ {1, ..., m} with i ≠ h do
      compute the transformation τ such that τ(p_1) = v_i and τ(p_2) = v_h;
      if for every k ∈ {1, ..., n} τ(τ_k(e(B_k))) coincides (for some j) with a region r_j in I
      then return True
    endfor
    return False
end

The correctness of Recognize(C, I) follows directly from Theorems 2 and 6. Regarding the time complexity, step (1) requires computing the centroids of the segmented regions. Several methods for computing centroids are well known in the literature [37]. Hence, we abstract from this detail, and assume there exists a function f(N_h, N_v) that bounds the complexity of computing one centroid, where N_h, N_v are the horizontal and vertical dimensions of I (in pixels). We report in the Appendix how we compute centroids, and concentrate on the complexity in terms of n, m, and f(N_h, N_v).
Theorem 7. Let C = ⟨τ_1, B_1⟩ ⊓ ... ⊓ ⟨τ_n, B_n⟩ be a composite shape description, and let I be an image with N_h × N_v pixels, segmented into regions {r_1, ..., r_m}. Moreover, let f(N_h, N_v) be a function bounding the complexity of computing the centroid of one region. Then C can be recognized in I in time O(m·f(N_h, N_v) + n + m²·n·N_h·N_v).

Proof. From the assumptions, Step (1) can be performed in time O(m·f(N_h, N_v)). Step (2) can be accomplished by extracting the n translation vectors from the transformations τ_1, ..., τ_n of the components of C; therefore, it requires O(n) time. Finally, the innermost check in Step (3), checking whether a transformed basic shape and a region coincide, can be performed in O(N_h·N_v), using a suitable marking of the pixels in I with the region they belong to. Hence, we obtain the claim.
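To make the procedure concrete, here is a minimal Python sketch of Recognize, assuming hypothetical helpers centroid, transform_contour and contours_match for the feature-level operations; the closed-form solution for τ from two centroid pairs follows the four-degrees-of-freedom argument in the proof of Theorem 6.

```python
import itertools
import math

def similarity_transform(p1, p2, v1, v2):
    """The unique rigid motion with uniform scaling mapping p1 -> v1 and
    p2 -> v2 (four degrees of freedom fixed by two point correspondences);
    assumes p1 != p2."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    ex, ey = v2[0] - v1[0], v2[1] - v1[1]
    scale = math.hypot(ex, ey) / math.hypot(dx, dy)
    rot = math.atan2(ey, ex) - math.atan2(dy, dx)
    c, s = scale * math.cos(rot), scale * math.sin(rot)
    # translation chosen so that p1 lands exactly on v1
    tx = v1[0] - (c * p1[0] - s * p1[1])
    ty = v1[1] - (s * p1[0] + c * p1[1])
    return (scale, rot, (tx, ty))

def recognize(component_contours, region_contours,
              centroid, transform_contour, contours_match):
    """Exact recognition (algorithm Recognize). component_contours are the
    placed contours tau_k(e(B_k)); the last three arguments are assumed
    helper functions. Needs at least two components."""
    p = [centroid(b) for b in component_contours]   # step (2)
    v = [centroid(r) for r in region_contours]      # step (1)
    # step (3): at most m(m - 1) candidate transformations
    for i, h in itertools.permutations(range(len(v)), 2):
        t = similarity_transform(p[0], p[1], v[i], v[h])
        if all(any(contours_match(transform_contour(t, b), r)
                   for r in region_contours)
               for b in component_contours):
            return True
    return False
```

The ordered pairs (i, h) with i ≠ h account for the m(m − 1) bound of Theorem 6; each candidate τ is then verified against all n component contours, as in the theorem's second claim.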
Since subsumption between two shape descriptions C and D can be reduced to recognizing C in I(D), the same upper bound holds for checking subsumption between composite shape descriptions, with the simplification that Step (1) can be accomplished without any further feature-level image processing.

5.2. Approximate recognition
The algorithm proposed in the previous section assumes exact recognition. Since the targets of retrieval are real images, approximate recognition is needed. We start by reconsidering the proof of Theorem 2, and in particular the matrix of equalities (4). Using the semantics for approximate recognition (3), the expanded formula for evaluating C^Σ(I) now becomes the following:

C^Σ(I) = max_τ min { max{sim(e(r_1), τ(τ_1(e(B_1)))), ..., sim(e(r_m), τ(τ_1(e(B_1))))},
                     ...,
                     max{sim(e(r_1), τ(τ_n(e(B_n)))), ..., sim(e(r_m), τ(τ_n(e(B_n))))} }
Now Properties 1-2 stated for exact recognition can be reformulated as hypotheses about sim, as follows.

1. For a given transformation, we assume that at most one region among r_1, ..., r_m is maximally similar to each component. This assumption can be justified by supposing its negation: if there are two regions both maximally similar to a component, then this maximal value should be a very low one, lowering the overall value because of the external minimization. This means that in maximizing each row, we can assume that the maximal value is given by one index among 1, ..., m.

2. For a given transformation, we assume that a region can yield a maximal similarity for at most one component. Again, the rationale of this assumption is that when a region yields a maximal similarity with two components in two different rows, this value can only be a low one, which propagates through the overall minimum. This means that in minimizing the maxima from all rows, we can consider a different region in each row.

We remark that also in the approximate case these assumptions do not imply that all regions have different shapes, since sim is a similarity measure which is 1 only for true overlap, not just for equal shapes with different pose. The assumptions just state that sim should be a function "near" to plain equality. The above assumptions imply that we can focus on injective mappings from {1..n} into {1..m} also for approximate recognition, yielding the formula

max_τ max_{j: {1..n} → {1..m}} min_{i=1..n} sim(e(r_j(i)), τ(τ_i(e(B_i))))
The choices of τ and j for the two maxima are independent, hence we can consider groups of regions first:

max_{j: {1..n} → {1..m}} max_τ min_{i=1..n} sim(e(r_j(i)), τ(τ_i(e(B_i))))   (9)
Differently from exact recognition, the choice of an injective mapping j does not directly lead to a transformation τ, since now τ depends on how the similarity of transformed shapes is computed, that is, the choice of τ depends on sim. In giving a definition of sim, we reconsider the other image features (color, texture) that were skipped in the theoretical part to ease the presentation of the semantics. This will introduce weighted sums in the similarity measure, where the weights are set by the user according to the importance of the features in the recognition. Let sim(r, ⟨c, t, τ, B⟩) be a similarity measure that takes a region r (with its color c(r) and texture t(r)) and a component ⟨c, t, τ, B⟩ into the range [0, 1] of real numbers (where 1 is perfect matching). We note that color and texture similarities do not depend on transformations, hence their introduction does not change Assumptions 1-2 above. Accordingly, Formula (9) becomes

max_{j: {1..n} → {1..m}} max_τ min_{i=1..n} sim(r_j(i), ⟨c, t, τ ∘ τ_i, B_i⟩)   (10)
This formula suggests that from all the groups of regions in an image that might resemble the components, we should select the groups that present the highest similarity. In artificially constructed examples in which all shapes in I and C resemble each other, this may generate an exponential number of groups to be tested. However, we can assume that in realistic images the similarity between shapes is selective enough to yield only a very small number of possible groups to try. We recall that in Gudivada's approach [30] an even stricter assumption is made, namely, each basic component in C does not appear twice, and each region in I matches at most one component in C. Hence our approach extends Gudivada's also in this respect, besides the fact that we consider shape, scale, rotation, color and texture of each component.
In spite of the assumptions made, finding an algorithm for computing the "best" τ in Formula (10) proved a difficult task. The problem is that there is a continuous spectrum of τ to be searched, and that the best τ may not be unique. We observed that when only single points are to be matched, instead of regions and components, our problem simplifies to Point Pattern Matching in Computational Geometry. However, even recent results in that research area are not complete, and cannot be directly applied to our problem. [11] solve nearly-exact point matching with efficient randomized methods, but without scaling. They also observe that best match is a more difficult problem than nearly-exact match. Also [16] propose a method for best match of shapes, but they analyze only rigid motions without scaling.
Therefore, we adopt some heuristics to evaluate the above formula. First of all, we decompose sim(r, ⟨c, t, τ, B⟩) as a sum of six weighted contributions. Three contributions are independent of the pose: color, texture and shape. The values of color and texture similarity are denoted by sim_color(c(r), c) and sim_texture(t(r), t), respectively. Similarity of the shapes (rotation-translation-scale invariant) is denoted by sim_shape(e(r), e(B)). For each feature, and each pair (region, component), we compute a similarity measure as explained in the Appendix. Then, we assign to all similarities of a feature, say color, the worst similarity in the group. This yields a pessimistic estimate
of Formula (10); however, for such an estimate the Downward Refinement property holds (see Theorem 8 below). The other three contributions depend on the pose, and try to evaluate how similar the pose of each region in the selected group is to the pose specified by the corresponding component in the sketch. In particular, sim_scale(e(r), τ(e(B))) represents how similar in scale the region and the transformed component are, while sim_rotation(e(r), τ(e(B))) denotes how e(r) and τ(e(B)) are similarly (or not) rotated with reference to the arrangement of the other components. Finally, sim_spatial(e(r), τ(e(B))) denotes a measure of how coincident the centroids of the region and the transformed component are. In summary, we get the following form for the overall similarity between a region and a component:

sim(r, ⟨c, t, τ, B⟩) = α·sim_spatial(e(r), τ(e(B))) + β·sim_shape(e(r), e(B)) + γ·sim_color(c(r), c) + δ·sim_rotation(e(r), τ(e(B))) + η·sim_scale(e(r), τ(e(B))) + ε·sim_texture(t(r), t)

where the coefficients α, β, γ, δ, η, ε weight the relevance each feature has in the overall similarity computation. Obviously, we impose α + β + γ + δ + η + ε = 1, and all coefficients are greater than or equal to 0.
Because of the difficulties in computing the best τ, we do not compute a maximum over all possible τ's. Instead, we evaluate whether there can be a rigid transformation with scaling from τ_1(e(B_1)), ..., τ_n(e(B_n)) into r_j(1), ..., r_j(n), through the similarities sim_spatial, sim_scale, and sim_rotation. There is such a transformation iff all these similarities are 1. If not, the lower the similarities are, the less "rigid" the transformation should be to match components and regions. Hence, instead of Formula (10) we evaluate the following simpler formula, interpreting the pose similarities in a different way:

max_{j: {1..n} → {1..m}} min_{i=1..n} sim(r_j(i), ⟨c, t, τ_i, B_i⟩)   (11)
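As an illustration of how Formula (11) can be evaluated once candidate groups of regions are available, the following Python sketch combines the six weighted contributions and takes the inner minimum for one injective mapping; the per-feature similarity functions are assumed to be supplied by the caller.

```python
FEATURES = ("spatial", "shape", "color", "rotation", "scale", "texture")

def overall_similarity(region, component, weights, sims):
    """Weighted sum of the six per-feature similarities (alpha..epsilon in
    the text); `sims` maps each feature name to an assumed similarity
    function returning a value in [0, 1]; weights are >= 0 and sum to 1."""
    return sum(weights[f] * sims[f](region, component) for f in FEATURES)

def evaluate_mapping(j, regions, components, weights, sims):
    """Inner minimum of Formula (11) for one injective mapping j,
    given as a tuple of region indices, one per component."""
    return min(overall_similarity(regions[j[i]], components[i], weights, sims)
               for i in range(len(components)))

# Formula (11) is then the best value over the candidate mappings:
# score = max(evaluate_mapping(j, regions, components, weights, sims)
#             for j in candidate_mappings)
```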
We now describe in detail how we estimate the pose similarities. Let C = ⟨c_1, t_1, τ_1, B_1⟩ ⊓ ... ⊓ ⟨c_n, t_n, τ_n, B_n⟩, and let j be an injective function from {1..n} into {1..m} that matches components with regions {r_j(1), ..., r_j(n)} respectively.

5.2.1. Spatial similarity
For a given component, say component 1, we compute all angles under which the other components are seen from it. Formally, let α_{i1h} be the counter-clockwise-oriented angle with vertex in the centroid of component 1, formed by the lines linking this centroid with the centroids of components i and h. There are n(n − 1)/2 such angles. Then, we compute the corresponding angles for region r_j(1), namely, the angles β_{j(i)j(1)j(h)} with vertex in the centroid of r_j(1), formed by the lines linking this centroid
with the centroids of regions r_j(i) and r_j(h) respectively. A pictorial representation of the angles is given in Figure 7.

Figure 7. Representation of angles used for computing spatial similarity of component 1 and region r_j(1).

Then we let the difference Δ_spatial(e(r_j(1)), τ_1(e(B_1))) be the maximal absolute difference between corresponding angles:

Δ_spatial(e(r_j(1)), τ_1(e(B_1))) = max_{i,h} |α_{i1h} − β_{j(i)j(1)j(h)}|
We compute an analogous measure for components 2, ..., n, and then we select the maximum of such differences:

Δ_spatial[j] = max_{l=1,...,n} Δ_spatial(e(r_j(l)), τ_l(e(B_l)))   (12)
where the argument j highlights the fact that this measure depends on the mapping j. Finally, we transform this maximal difference, for which perfect matching yields 0, into a minimal similarity, for which perfect matching yields 1, with the help of the function Φ described in the Appendix. This minimal similarity is then assigned to every sim_spatial(e(r_j(i)), τ_i(e(B_i))), for i = 1, ..., n. Intuitively, our estimate measures the difference in the arrangement of centroids between the composite shape and the group of regions. If there exists a transformation bringing components into regions exactly, every difference is 0, and so sim_spatial raises to 1 for every component. The more one arrangement is scattered with reference to the other arrangement, the higher its maximum difference. The reason why we use the maximum of all differences as the similarity for each component-region pair will become clear when we later prove that this measure obeys the Downward Refinement property.
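A possible rendering of the spatial difference Δ_spatial and of Formula (12) in Python is sketched below; centroids are plain (x, y) pairs and j is a tuple of region indices, one per component.

```python
import itertools
import math

def angle_at(vertex, a, b):
    """Counter-clockwise-oriented angle at `vertex` between the rays
    vertex->a and vertex->b, reduced to [0, 2*pi)."""
    t = (math.atan2(b[1] - vertex[1], b[0] - vertex[0])
         - math.atan2(a[1] - vertex[1], a[0] - vertex[0]))
    return t % (2.0 * math.pi)

def delta_spatial(comp_centroids, region_centroids, j):
    """Delta_spatial[j] of Formula (12): worst absolute difference between
    corresponding angles, over all reference components l and all pairs
    (i, h) of the remaining ones (wrap-around of angles is ignored here)."""
    n = len(comp_centroids)
    worst = 0.0
    for l in range(n):
        others = [k for k in range(n) if k != l]
        for i, h in itertools.combinations(others, 2):
            alpha = angle_at(comp_centroids[l], comp_centroids[i],
                             comp_centroids[h])
            beta = angle_at(region_centroids[j[l]], region_centroids[j[i]],
                            region_centroids[j[h]])
            worst = max(worst, abs(alpha - beta))
    return worst
```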
5.2.2. Rotation similarity

For every basic shape one can imagine a unit vector with origin in its centroid and oriented horizontally to the right (as seen on the palette). When the shape is used as a component, say component 1, this vector too is rotated according to τ_1. Let h denote such a rotated vector. For i = 2, ..., n let γ_{i1} be the counter-clockwise-oriented angle with vertex in the centroid of component 1, formed by h and the line linking the centroid of component 1 with the centroid of component i. For region r_j(1), the analogue u of h can be constructed by finding the rotation phase for which the cross-correlation attains a maximum value (see Appendix). Then, for i = 2, ..., n let δ_{j(i)j(1)} be the angles with vertex in the centroid of r_j(1), formed by u and the line linking the centroid of r_j(1) with the centroid of r_j(i). Figure 8 clarifies the angles we are computing. Then we let the difference Δ_rotation(e(r_j(1)), τ_1(e(B_1))) be the maximal absolute difference between corresponding angles:

Δ_rotation(e(r_j(1)), τ_1(e(B_1))) = max_{i=2,...,n} |γ_{i1} − δ_{j(i)j(1)}|

If there is more than one orientation of r_j(1) for which the cross-correlation yields a maximum (e.g., a square has four such orientations), then we compute the above maximal difference for all such orientations, and take the best (the minimal) one. We repeat the process for components 2 to n, and we select the maximum of such differences:

Δ_rotation[j] = max_{l=1,...,n} Δ_rotation(e(r_j(l)), τ_l(e(B_l)))   (13)
Figure 8. Representation of angles used for computing rotation similarity of component 1 and region r_j(1).
Figure 9. Sizes and distances for scale similarity computation of component 1 and region r_j(1).
Finally, as for spatial similarity, we transform Δ_rotation[j] into a minimal similarity with the help of Φ. This minimal similarity is then assigned to every sim_rotation(e(r_j(i)), τ_i(e(B_i))), for i = 1, ..., n. Observe that these differences too drop to 0 when there is a perfect match, hence the similarity raises to 1. The more a region has to be rotated with reference to the other regions to match a component, the higher the rotational differences. Again, the fact that we use the worst difference to compute all rotational similarities will be exploited in the proof of Downward Refinement.
5.2.3. Scale similarity

We concentrate again on component 1 to ease the presentation. Let m_1 be the size of component 1, computed as the mean distance between its centroid and the points on its contour. Moreover, for i = 2, ..., n, let d_{1i} be the distance between the centroid of component 1 and the centroid of component i. In the image, let M_j(1) be the size of region r_j(1), and let D_{j(1)j(i)} be the distance between the centroids of regions r_j(1) and r_j(i). Figure 9 pictures the quantities we are computing.
We define the difference in scale between e(r_j(1)) and τ_1(e(B_1)) as:

Δ_scale(e(r_j(1)), τ_1(e(B_1))) = max_{i=2,...,n} { 1 − min(M_j(1)/D_{j(1)j(i)}, m_1/d_{1i}) / max(M_j(1)/D_{j(1)j(i)}, m_1/d_{1i}) }
We repeat the process for components 2 to n, and we select the maximum of such differences:

Δ_scale[j] = max_{l=1,...,n} Δ_scale(e(r_j(l)), τ_l(e(B_l)))   (14)
Finally, as for the other similarities, we transform Δ_scale[j] into a minimal similarity with the help of Φ.
Using the same worst difference in evaluating the pose similarities of all components may appear a somewhat drastic choice. However, we were guided in this choice by the goal of preserving the Downward Refinement property, even though we had to abandon the exact recognition of the previous section.
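For completeness, here is a small Python sketch of the scale difference Δ_scale just defined; the size and distance tables are assumed to be precomputed.

```python
def delta_scale(m_c, d_c, m_r, d_r, j):
    """Delta_scale[j] of Formula (14). For each reference component l and
    each other component i, compare the size/distance ratio of the
    components with that of the matched regions.
    m_c[l]   : size of component l (mean centroid-to-contour distance)
    d_c[l][i]: centroid distance between components l and i (l != i)
    m_r, d_r : the analogous tables for the regions."""
    n = len(m_c)
    worst = 0.0
    for l in range(n):
        for i in range(n):
            if i == l:
                continue
            rc = m_c[l] / d_c[l][i]            # m_l / d_li
            rr = m_r[j[l]] / d_r[j[l]][j[i]]   # M_j(l) / D_j(l)j(i)
            worst = max(worst, 1.0 - min(rc, rr) / max(rc, rr))
    return worst
```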
Theorem 8. Let C be a composite shape description, and let D be a refinement of C, that is, D ≐ C ⊓ ⟨c′, t′, τ′, B′⟩. For every image I, segmented into regions r_1, ..., r_m, if C^Σ(I) and D^Σ(I) are computed as in (11) using the similarities defined above, then it holds that D^Σ(I) ≤ C^Σ(I).
Proof. Every injective function j used to map the components of C into I can be extended to a function j′ by letting j′(n + 1) ∈ {1, ..., m} be a suitable region index not in the range of j. Since D^Σ(I) is computed over such extended mappings, it is sufficient to show that the values computed in Formula (11) do not increase with reference to the values computed for C.
Let j_1 be the mapping for which the maximum value C^Σ(I) is reached. Every extension j′_1 of j_1 leads to a minimum value in Formula (11) which is lower than or equal to C^Σ(I). In fact, all pose differences (12), (13), (14) are computed as maxima over a strictly greater set of values, hence the pose similarities have either the same value, or a lower one. Regarding color, texture, and shape similarities, adding another component can only worsen the values for the components of C, since we assign to all components the worst similarity in the group. Now consider another injective mapping j_2 that yields a non-maximum value v_2 < C^Σ(I) in Formula (11). Using the above argument about the pose differences (12), (13), (14), every extension j′_2 leads to a minimum value v′_2 ≤ v_2. Since v_2 < C^Σ(I), every extension of every mapping j different from j_1 also yields a value which is less than C^Σ(I). This completes the proof.
6. REPRESENTING SHAPES, OBJECTS AND IMAGES
In this section we briefly review the methods we used for the extraction of image features. We also describe the smoothing function Φ used in the similarity computations.
In order to deal with objects in an image, segmentation is required to obtain a partition of the image. Several segmentation algorithms have been proposed in the literature; our approach does not depend on the particular segmentation algorithm adopted. It is in any case obvious that the better the segmentation, the better our system will work. In our system we used a simple algorithm that merges edge detection and region growing. An illustration of this technique is beyond the scope of this paper; we limit ourselves here to the description of the image feature computations, which assume a successful segmentation.
To make the description self-contained, we start by defining a generic color image as {I(x, y) | 1 ≤ x ≤ N_h, 1 ≤ y ≤ N_v}, where N_h, N_v are the horizontal and vertical dimensions, respectively, and I(x, y) is a three-component tuple (R, G, B). We assume that the image I has been partitioned into m regions r_i, i = 1, ..., m, satisfying the following properties:

• I = ∪ r_i, i = 1, 2, ..., m
• ∀ i ∈ {1, 2, ..., m}, r_i is a nonempty and connected set
• r_i ∩ r_j = ∅ for i ≠ j
• each region satisfies heuristic and physical requirements.
We characterize each region r_i by the following attributes: shape, position, size, orientation, color and texture.
Shape. Given a connected region, a point moving along its boundary generates a complex function defined as z(t) = x(t) + jy(t), t = 1, ..., N_b, with N_b the number of boundary sample points. Following the approach proposed by [52], we define the Discrete Fourier Transform (DFT) of z(t) as:

Z(k) = Σ_{t=1}^{N_b} z(t) e^{−j(2πtk)/N_b} = M(k) e^{jθ(k)}

with k = 1, ..., N_b. In order to address the spatial discretization problem, we compute the Fast Fourier Transform (FFT) of the boundary z(t), and use the first 2N_c + 1 FFT coefficients to form a dense, non-uniform set of points of the boundary as:

z_dense(t) = Σ_{k=−N_c}^{N_c} Z(k) e^{j(2πtk)/N_b}
with t = 1, ..., N_dense. We then interpolate these samples to obtain uniformly spaced samples z_unif(t), t = 0, ..., N_unif. We compute again the FFT of z_unif(t), obtaining the Fourier coefficients Z_unif(k), k = −N_c, ..., N_c. The shape feature of a region is hence characterized by a vector of 2N_c + 1 complex coefficients.
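The boundary-resampling scheme just described can be sketched in Python with NumPy as follows; the parameter values are illustrative, and the arc-length resampling uses a nearest-sample approximation of the interpolation step.

```python
import numpy as np

def shape_descriptor(x, y, n_c=16, n_dense=1024, n_unif=256):
    """Shape feature of a region boundary following the scheme above.
    x, y: boundary sample coordinates, in traversal order."""
    z = np.asarray(x, dtype=float) + 1j * np.asarray(y, dtype=float)
    Z = np.fft.fft(z)                              # DFT of the boundary
    # densify: zero-pad the first 2*n_c + 1 coefficients and invert
    spectrum = np.zeros(n_dense, dtype=complex)
    spectrum[:n_c + 1] = Z[:n_c + 1]
    spectrum[-n_c:] = Z[-n_c:]
    z_dense = np.fft.ifft(spectrum) * (n_dense / len(z))
    # resample uniformly by arc length (nearest-sample approximation)
    d = np.abs(np.diff(z_dense, append=z_dense[:1]))
    s = np.concatenate([[0.0], np.cumsum(d)[:-1]])
    targets = np.linspace(0.0, s[-1] + d[-1], n_unif, endpoint=False)
    z_unif = z_dense[np.searchsorted(s, targets, side="right") - 1]
    # the FFT of the resampled boundary yields the 2*n_c + 1 coefficients
    Z_unif = np.fft.fft(z_unif)
    return np.concatenate([Z_unif[:n_c + 1], Z_unif[-n_c:]])
```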
Position and Size. Position is determined as the region centroid, computed via moment invariants [48]. Size is computed as the mean distance between the region centroid and the points on the contour.

Orientation. In order to quantify the orientation of each region r_i we use the same Fourier representation, which stores the orientation information in the phase values. We obviously also deal with the special cases in which the shape of a region has more than one symmetry, e.g., a rectangle or a circle. Rotational similarity between a reference shape B and a given region r_i can then be obtained by finding maximum values via cross-correlation:

C(t) = (1/(2N_c + 1)) Σ_{k=−N_c}^{N_c} Z_B(k) Z*_{r_i}(k) e^{j[2π/(2N_c+1)]kt},  with t ∈ {0, ..., 2N_c}
Color. The color information of each region r_i is stored, after quantization in a 112-value color space, as the mean RGB value within the region:

R_{r_i} = (1/|r_i|) Σ_{p∈r_i} R(p),  G_{r_i} = (1/|r_i|) Σ_{p∈r_i} G(p),  B_{r_i} = (1/|r_i|) Σ_{p∈r_i} B(p)
Texture. We extract the texture information of each region r_i with a method based on the work in [47]. Following this approach, we extract texture features by convolving the original grey-level image I(x, y) with a bank of Gabor filters, having the following impulse response:

h(x, y) = (1/(2πσ²)) · e^{−(x²+y²)/(2σ²)} · e^{−j2π(Ux+Vy)}

where (U, V) represents the filter location in the frequency domain, λ is the central frequency, σ is the scale factor, and θ the orientation, defined as:

λ = √(U² + V²),  θ = arctan(U/V)
The processing allows us to extract a 24-component feature vector, which characterizes each textured region.
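A minimal sketch of such a Gabor filter bank in Python follows; the spatial support, the (U, V, σ) bank, and the choice of mean and standard deviation of the filtered magnitudes as the 24 components are assumptions, not necessarily the authors' exact design.

```python
import numpy as np
from scipy.signal import fftconvolve  # assumed available

def gabor_kernel(U, V, sigma, size=31):
    """Complex Gabor impulse response centred at frequency (U, V),
    following the formula above; `size` is an illustrative support."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    envelope = (np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
                / (2.0 * np.pi * sigma ** 2))
    return envelope * np.exp(-1j * 2.0 * np.pi * (U * x + V * y))

def texture_features(gray, bank):
    """One plausible reading of the 24-component vector: mean and standard
    deviation of the filtered magnitude for each of 12 (U, V, sigma)
    filters in `bank`."""
    feats = []
    for (U, V, sigma) in bank:
        resp = np.abs(fftconvolve(gray, gabor_kernel(U, V, sigma), mode="same"))
        feats.extend([resp.mean(), resp.std()])
    return np.array(feats)
```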
6.2. Similarity computation

Smoothing function Φ. In all similarity measures, we use the function Φ(x, ε_x, ε_y). The role of this function is to change a distance x (in which 0 corresponds to perfect
matching) into a similarity measure (in which the value 1 corresponds to perfect matching), and to "smooth" the changes of the quantity x, depending on the two parameters ε_x, ε_y:

Φ(x, ε_x, ε_y) = ε_y + (1 − ε_y) cos(πx/(2ε_x))   if 0 ≤ x < ε_x
Φ(x, ε_x, ε_y) = ε_y [1 − (1/π) arctan(π(x − ε_x)(1 − ε_y)/(ε_x ε_y))]   if x ≥ ε_x
where ε_x > 0 and 0 < ε_y < 1.
The input data to the approximate recognition algorithm are a shape description D, containing n components ⟨c_k, t_k, τ_k, B_k⟩, and an image I segmented into m regions r_1, ..., r_m. The algorithm provides a measure of the approximate recognition of D in I. The first step of the algorithm considers all the m regions of the image and all the n components in the shape description D and finds, if any, all the groups of n regions r_j(k) with the highest shape similarity to the shape components of D. To this purpose we compute shape similarity, based on the Fourier representation previously introduced, as a vector of complex coefficients. Such a measure, denoted by sim_shape, is invariant with respect to rotation, scale and translation, and is computed as the cosine distance between the two vectors. The similarity gives a measure in the range [0, 1], assuming the highest value sim_shape = 1 for perfect matching. Let X and Y be the vectors of complex coefficients describing respectively the shape of a region r_i and the shape of a component B_j, X = (x_1, ..., x_{2N_c+1}) and Y = (y_1, ..., y_{2N_c+1}).
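The smoothing function Φ (as reconstructed above; the exact constants of the arctan branch should be taken as indicative) and the cosine-based sim_shape can be sketched as follows.

```python
import math
import numpy as np

def phi(x, eps_x, eps_y):
    """Smoothing function mapping a difference x >= 0 (0 = perfect match)
    to a similarity in (0, 1]; requires eps_x > 0 and 0 < eps_y < 1."""
    if x < eps_x:
        return eps_y + (1.0 - eps_y) * math.cos(math.pi * x / (2.0 * eps_x))
    return eps_y * (1.0 - math.atan(math.pi * (x - eps_x) * (1.0 - eps_y)
                                    / (eps_x * eps_y)) / math.pi)

def sim_shape(X, Y):
    """Cosine similarity between two complex Fourier-coefficient vectors,
    as described in the text."""
    den = np.linalg.norm(X) * np.linalg.norm(Y)
    return abs(np.vdot(X, Y)) / den if den else 0.0
```

Note that phi(0, ...) = 1 and phi(eps_x, ...) = eps_y, so the two branches join continuously, which is what makes Φ a smooth distance-to-similarity conversion.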
Shape similarity. sim_shape measures the similarity between the shapes in the composite shape description and the regions in the segmented image.
Color similarity. sim_color measures the similarity in terms of color appearance between the regions and the corresponding shapes in the composite shape description. In the following formula, Δ_color(k).R denotes the difference in the red color component between the k-th component of D and the region r_j(k), and similarly for the green and the blue color components.
Then the function Φ is applied to the worst difference in the group:

sim_color = Φ(max_{k=1,...,n} {Δ_color(k)}, ε_x,color, ε_y,color)
Texture similarity. sim_texture measures the similarity between the texture features of the components of D and those of the corresponding regions. Δ_texture(k) denotes the sum of the differences in the texture components between the k-th component of D and the region r_j(k), divided by the standard deviation of the elements:

sim_texture = Φ(max_{k=1,...,n} {Δ_texture(k)}, ε_x,texture, ε_y,texture)
7. PROTOTYPE SYSTEM
We implemented a complete client-server image retrieval system, which allows a user to pose both queries by sketch and queries by example.

Interface: the user is given a simple visual language to specify (by sketch or by example) a geometric composition of basic shapes, which we call a description. The composite shape description intuitively stands for a set of images (all containing the given shapes in their relative positions); it can be used either as a query, or as an index for a relevant class of images, to be given some meaningful name. Figure 10 shows the user interface.

Figure 10. A snapshot of the user interface.

Syntax and semantics: the system has an internal syntax to represent the user's queries and descriptions, and the syntax is given an extensional semantics in terms of sets of retrievable images. In contrast with existing image retrieval systems, our semantics is compositional, in the sense that adding details to the sketch may only restrict the set of retrievable images. Syntax and semantics constitute a Semantic Data Model, in which the relative position, orientation and size of each shape component are given an explicit notation through a geometric transformation. The extensional semantics allows us to define a hierarchy of composite shape descriptions, based on set containment between interpretations of descriptions. Coherently, the recognition of a shape description in an image is defined as an interpretation satisfying the description.

Algorithms and complexity: based on the semantics, subsumption between descriptions can be carried out in terms of recognition. Exact and approximate algorithms for composite shape recognition in an image have been presented above, which are correct with respect to the semantics. Generally, soundness and completeness refer to the fidelity of an algorithm to a model-theoretic criterion, such as a model-theoretic semantics. Informally, an algorithm is sound if whatever it concludes is justified by the model-theoretic semantics, usually, if it is true in all allowable models. Conversely, an algorithm is complete if it is guaranteed to draw any conclusion that is so justified. Ideally, if the computational complexity of the retrieval problem were known, the algorithms should also be optimal with reference to the computational complexity
of the problems. Presently, we have solved the problem for exact retrieval, and propose an algorithm for approximate retrieval which, although probably non-optimal, is correct.

7.1. Knowledge base management
The knowledge base supports the following functionalities:

• shape, object and image insertion
• query by sketch
• textual query
• shape, object and image deletion

Such functionalities are performed using a hierarchical graph to represent and organize shape and object descriptions and real images.

Basic shape insertion

Basic shapes belong to the highest level of the hierarchy. More complex shapes are obtained by combining such elementary shapes and/or by applying transformations
(rotation, scaling and translation) to basic shapes. An image is linked to a node N if it contains the object or the basic shape corresponding to the node. Images are linked to a node in the structure depending on the most specific description that they are able to satisfy.

Figure 11. The insertion of a new object in the hierarchy.

New object insertion
A new object is inserted in the knowledge base as a new node. The insertion is carried out through a search process in the hierarchy to find the exact position where the new description D (a simple or a complex one) has to be inserted. The position is determined by considering the descriptions that the new one is subsumed by. Once the position has been found, the real images in which the new description is recognized are linked to it. Basic shapes have no parents, so they are at the top of the hierarchy. Complex objects are linked to the basic shapes they contain. Images containing the object, or a group of regions whose configuration is similar to the object, are linked to the node. The insertion algorithm for an object O determines the set of parent nodes of the new node. The first step of the algorithm searches the top level of the graph for the
parent nodes N_i of the new object. The set G = {N_0, ..., N_{g−1}} is filled with the nodes corresponding to the basic shapes recognized in the new object. In the next steps the algorithm performs a depth-first search in the graph for each node N_i ∈ G. Given the set C of child nodes containing objects of O, the elements C_i ∈ C replace their parents in G only if all their parent nodes belong to G. At the end of the iterative search over all the nodes in G, the set G contains the direct ancestors of the new node.
The algorithm then determines the set H_O of images that might contain the new object O. Given a node N_i, the set of images linked to N_i or to a node derived from it is obtained as:

X_{N_i} = I_{N_i} ∪ (∪_j X_{D_j})

where I_{N_i} is the set of images linked to N_i and D_j is a node derived from N_i. Given the set G = {N_0, ..., N_{g−1}} of parents of O, the set of images to link to the new object is:

H_O = ∪_{i=0}^{g−1} X_{N_i}
H_O is the set of images containing the basic shapes of O. The set T_O ⊆ H_O contains the images in H_O that effectively contain O. Given the set of images linked to the nodes N_i ∈ G, the set

M_O = ∪_{i=0}^{g−1} I_{N_i}
is determined. The links to the images in the set T_O ∩ M_O are moved to the new node N, and the links to the images in [T_O − (T_O ∩ M_O)] are copied to N instead of being moved. For the insertion of a real image, step 3 of the algorithm returns the set of nodes in which the new image must be inserted.
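The set manipulations just described can be sketched as follows; the node structure and the recognizes predicate are hypothetical stand-ins for the system's actual data structures.

```python
def images_under(node):
    """X_N: images linked to node N or to any node derived from it."""
    out = set(node.images)
    for child in node.children:
        out |= images_under(child)
    return out

def link_new_object(new_node, parents, recognizes):
    """Redistribute image links when a new object node is attached below
    its direct ancestors `parents` (the set G); `recognizes` is an assumed
    predicate testing whether the new object is recognized in an image."""
    H = set().union(*(images_under(N) for N in parents))   # H_O
    T = {img for img in H if recognizes(img)}              # T_O, subset of H_O
    M = set().union(*(set(N.images) for N in parents))     # M_O
    for img in T & M:                 # links moved to the new node
        for N in parents:
            N.images.discard(img)
        new_node.images.add(img)
    new_node.images |= (T - M)        # links copied instead of being moved
```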
New image insertion

The insertion of a new image requires a reorganization of the graph in order to add the links to the new image. The algorithm determines the nodes in the graph to which the new image should be tied. It includes the object insertion algorithm, since the set G is filled with the nodes tied to the basic shapes in the image I.

Query by sketch
Query processing is performed through the object insertion algorithm, except that the object is not actually inserted as a new node. In this way it is possible to keep the size of the graph small. For all the elements of the set T_O of images linked to the (virtual) new node, a similarity measure is computed through the recognition algorithm. Images are returned to the user ranked by this similarity measure.
Query by sketch can also be used to retrieve objects instead of images. The prototype has been used to carry out an extensive set of experiments on a test database of images, which allowed us to verify the effectiveness of the proposed approach in comparison with the rankings of expert users. A description of the experiments and an evaluation of the proposed method can be found in [21].

8. DISCUSSION
Feature-based approaches to content-based image retrieval have been widely studied. Nevertheless, low-level features are unable to capture the semantics of images. Here we presented a Knowledge Representation approach to Image Retrieval: we proposed a language to describe composite shapes, and gave an extensional semantics to queries, in terms of sets of retrieved images. To cope with a realistic setting from the beginning, we also generalized the semantics to fuzzy membership of an image in a description. The composition of shapes is made possible by the explicit use in our language of geometric transformations (translation-rotation-scale), which we borrowed from hierarchical object modeling in Computer Graphics and which significantly extend standard invariant recognition of single shapes in image retrieval. The extensional semantics allows us to properly define subsumption between queries. Borrowing also from Structured Knowledge Representation, and in particular from Description Logics, we store shape descriptions in a subsumption hierarchy. The hierarchy provides a semantic index to the images in a database. The logical semantics allowed us to define other reasoning services: the recognition of a shape arrangement in an image, the classification of an image with reference to a hierarchy of descriptions, and subsumption between descriptions. These tasks are aside from, but can speed up, the main one, which is Image Retrieval. We proved that subsumption in our simple logic can be reduced to recognition, and gave a polynomial-time algorithm to perform exact recognition. Further research is needed in various directions. The language for describing composite shapes could be enriched either with other logic-oriented connectives, e.g., alternative components corresponding to an OR in compositions, or with sequences of shape arrangements, to cope with objects with internal movements in video sequence retrieval. Furthermore, techniques from Computational Geometry could be used to optimize the algorithms for approximate retrieval, while a study of the complexity of the recognition problem for composite shapes might prove the theoretical optimality of the algorithms.

REFERENCES

[1] Aiello, M. 2001. Computing spatial similarity by games. In Esposito, F., Proceedings of the Eighth Conference of the Italian Association for Artificial Intelligence (AI*IA'99), 2175 in Lecture Notes in Artificial Intelligence, 99-110. Springer-Verlag.
[2] Ardizzone, E., Chella, A., Gaglio, S. 1997. Hybrid computation and reasoning for artificial vision. In Cantoni, V., Levialdi, S., Roberto, V., Artificial Vision, 193-221. Academic Press.
[3] Baader, F., Hanschke, P. 1991. A schema for integrating concrete domains into concept languages. In Proceedings of the Twelfth International Joint Conference on Artificial Intelligence (IJCAI'91), 452-457, Sydney.
[4] Baader, F., Calvanese, D., McGuinness, D., Nardi, D., Patel-Schneider, P., editors. 2003. The Description Logic Handbook: Theory, Implementation and Applications. Cambridge University Press.
[5] Bach, R., Fuller, C., Gupta, A., Hampapur, A., Horowitz, B., Humphrey, R., Jain, R., Shu, C. 1996. The Virage image search engine: an open framework for image management. In Storage and Retrieval for Image and Video Databases, 2670, 76-87. SPIE.
[6] Bertino, E., Catania, B. 1998. A constraint-based approach to shape management in multimedia databases. MultiMedia Systems, 6, 2-16.
[7] Del Bimbo, A. 1999. Visual Information Retrieval. Morgan Kaufmann Publishers.
[8] Borgida, A. 1995. Description logics in data management. IEEE Transactions on Knowledge and Data Engineering, 7(5), 671-682.
[9] Brooks, R. 1981. Symbolic reasoning among 3-D models and 2-D images. Artificial Intelligence, 17, 285-348.
[10] Calvanese, D., Lenzerini, M., Nardi, D. 1998. Description logics for conceptual data modeling. In Chomicki, J., Saake, G., Logics for Databases and Information Systems, 229-264. Kluwer Academic Publishers.
[11] Cardoze, D., Schulman, L. 1998. Pattern matching for spatial point sets. In Proceedings of the Thirty-ninth Annual Symposium on the Foundations of Computer Science (FOCS'98), 156-165, Palo Alto, CA.
[12] Carson, C., Thomas, M., Belongie, S., Hellerstein, J. M., Malik, J. 1999. Blobworld: A system for region-based image indexing and retrieval. In Huijsmans, D., Smeulders, A., Lecture Notes in Computer Science, 1614, 509-516. Springer-Verlag.
[13] Celentano, A., Di Sciascio, E. 1998. Features integration and relevance feedback analysis in image similarity evaluation. Journal of Electronic Imaging, 7(2), 308-317.
[14] Chandra, A., Harel, D. 1980. Computable queries for relational databases. Journal of Computer and System Sciences, 21, 156-178.
[15] Chang, S., Shi, Q., Yan, C. 1987. Iconic indexing by 2D strings. IEEE Transactions on Pattern Analysis and Machine Intelligence, 9(3), 413-428.
[16] Chew, L., Goodrich, M., Huttenlocher, D., Kedem, K., Kleinberg, J., Kravets, D. 1997. Geometric pattern matching under euclidean motion. Computational Geometry, 7, 113-124.
[17] Cox, I., Miller, M., Minka, T., Papathomas, T. 2000. The Bayesian image retrieval system, PicHunter. IEEE Transactions on Image Processing, 9(1), 20-37.
[18] Di Sciascio, E., Donini, F. M., Mongiello, M. 2000. A description logic for image retrieval. In Lamma, E., Mello, P., AI*IA 99: Advances in Artificial Intelligence, 1792 in Lecture Notes in Artificial Intelligence, 13-24. Springer-Verlag.
[19] Di Sciascio, E., Donini, F. M., Mongiello, M. 2002. Spatial layout representation for query-by-sketch content-based image retrieval. Pattern Recognition Letters, 23(13), 1599-1612.
[20] Di Sciascio, E., Donini, F. M., Mongiello, M. 2002. A logic for SVG documents query and retrieval. In Proceedings of the International Workshop on Multimedia Semantics (SOFSEM 2002), Milovy, Czech Republic, November 28-29.
[21] Di Sciascio, E., Donini, F. M., Mongiello, M. 2002. Structured knowledge representation for image retrieval. Journal of Artificial Intelligence Research, 16, 209-257.
[22] Di Sciascio, E., Mongiello, M. 1999. Query by sketch and relevance feedback for content-based image retrieval over the web. Journal of Visual Languages and Computing, 10(6), 565-584.
[23] Donini, F., Lenzerini, M., Nardi, D., Schaerf, A. 1996. Reasoning in description logics. In Brewka, G., Foundations of Knowledge Representation, 191-236. CSLI Publications.
[24] Edelman, S. 1999. Representation and Recognition in Vision. The MIT Press.
[25] El-Kwae, E., Kabuka, M. 1999.
Content-based retrieval by spatial similarity in image databases. ACM Transactions on Information Systems, 17, 174-198.
[26] Flickner, M., Sawhney, H., Niblack, W., Ashley, J., Huang, Q., Dom, B., Gorkani, M., Hafner, J., Lee, D., Petkovic, D., Steele, D., Yanker, P. 1995. Query by image and video content: The QBIC system. IEEE Computer, 28(9), 23-31.
[27] Foley, J., van Dam, A., Feiner, S., Hughes, J. 1996. Computer Graphics. Addison-Wesley, Reading, Massachusetts.
[28] Fuhr, N., Gövert, N., Rölleke, T. 1998. DOLORES: A system for logic-based retrieval of multimedia objects. In Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '98), 257-265, Melbourne, Australia.
[29] Gevers, T., Smeulders, A. 2000. PicToSeek: Combining color and shape invariant features for image retrieval. IEEE Transactions on Image Processing, 9(1), 102-119.
[30] Gudivada, V. 1998. ΘR-string: A geometry-based representation for efficient and effective retrieval of images by spatial similarity. IEEE Transactions on Knowledge and Data Engineering, 10(3), 504-512.
[31] Gudivada, V., Raghavan, V. 1995. Design and evaluation of algorithms for image retrieval by spatial similarity. ACM Transactions on Information Systems, 13(2), 115-144.
[32] Haarslev, V., Lutz, C., Moeller, R. 1998. Foundations of spatioterminological reasoning with description logics. In Proceedings of the Sixth International Conference on Principles of Knowledge Representation and Reasoning (KR'98), 112-123.
[33] Hacid, M.-S., Rigotti, C. 1999. Representing and reasoning on conceptual queries over image databases. In Proceedings of the Twelfth International Symposium on Methodologies for Intelligent Systems (ISMIS'99), 1609 in Lecture Notes in Artificial Intelligence, 340-348, Warsaw, Poland. Springer-Verlag.
[34] Hartman, J., Wernecke, J. 1996. The VRML 2.0 Handbook. Addison-Wesley.
[35] Hirata, K., Kato, T. 1992. Query by visual example. In Pirotte, A., Delobel, C., Gottlob, G., Advances in Database Technology, Proceedings of the 3rd International Conference on Extending Database Technology (EDBT), 580 of Lecture Notes in Computer Science, 56-71. Springer-Verlag.
[36] Jacobs, C., Finkelstein, A., Salesin, D. 1995. Fast multiresolution image querying. In Proceedings of the 22nd Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '95), 277-286.
[37] Jähne, B., Haußecker, H., Geißler, P. 1999. Handbook of Computer Vision and Applications. Academic Press.
[38] Ma, W., Manjunath, B. 1997. NETRA: A toolbox for navigating large image databases. In Proceedings of the IEEE International Conference on Image Processing (ICIP '97), 1, 568-571, Santa Barbara.
[39] Marr, D. 1982. Vision. W.H. Freeman and Co., Oxford.
[40] Meghini, C., Sebastiani, F., Straccia, U. 2001. A model of multimedia information retrieval. Journal of the ACM, 48(5), 909-970.
[41] Moeller, R., Neumann, B., Wessel, M. 1999. Towards computer vision with description logics: some recent progress. In Proceedings of the IEEE Integration of Speech and Image Understanding, 101-115.
[42] Nebel, B. 1990. Reasoning and Revision in Hybrid Representation Systems. 422 in Lecture Notes in Artificial Intelligence. Springer-Verlag.
[43] Niblack, W., Barber, R., Equitz, W., Flickner, M., Glasman, E., Petkovic, D., Yanker, P., Faloutsos, C. 1993. The QBIC project: Querying images by content using color, texture, and shape. In Storage and Retrieval for Still Image and Video Databases, 1908, 173-182. SPIE.
[44] Paquet, E., Rioux, M. 1998. A content-based search engine for VRML databases. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR'98), 541-546, Santa Barbara, CA.
[45] Picard, R., Kabir, T. 1993. Finding similar patterns in large image databases. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '93), 161-164, Minneapolis, MN.
[46] Pirri, F., Finzi, A. 1999. An approach to perception in theory of actions: part 1. In Linköping Electronic Articles in Computer and Information Science, 41. Linköping University Electronic Press.
[47] Pok, G., Liu, J. 1999. Texture classification by a two-level hybrid scheme. In Storage and Retrieval for Image and Video Databases VII, 3656, 614-622. SPIE.
[48] Pratt, W. 1991. Digital Image Processing. J. Wiley & Sons Inc., Englewood Cliffs, NJ.
[49] Reiter, R., Mackworth, A. 1989. A logical framework for depiction and image interpretation. Artificial Intelligence, 41(2), 125-155.
[50] Reiter, R. 1980. Equality and domain closure in first-order databases. Journal of the ACM, 27(2), 235-249.
[51] Rui, Y., Huang, T., Mehrotra, S. 1997.
Content-based image retrieval with relevance feedback in MARS. In Proceedings of the IEEE International Conference on Image Processing (ICIP '97), 815-818.
[52] Rui, Y., She, A., Huang, T. 1996. Modified Fourier descriptors for shape representation: a practical approach. In Proceedings of the 1st Workshop on Image Databases and Multimedia Search, Amsterdam.
[53] Sanfeliu, A., Fu, K. 1983. A distance measure between attributed relational graphs for pattern recognition. IEEE Transactions on Systems, Man, and Cybernetics, 13(3), 353-362.
[54] Schmidt-Schauß, M., Smolka, G. 1991. Attributive concept descriptions with complements. Artificial Intelligence, 48(1), 1-26.
[55] Smith, J., Chang, S. 1996. VisualSEEk: a fully automated content-based image query system. In Proceedings of the Fourth ACM International Conference on Multimedia (Multimedia'96), 87-98.
[56] Straccia, U. 2001. Reasoning within fuzzy description logics. Journal of Artificial Intelligence Research, 14, 137-166.
[57] Tagare, H., Vos, F., Jaffe, C., Duncan, J. 1995. Arrangement: A spatial relation between parts for evaluating similarity of tomographic sections. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(9), 880-893.
[58] Ullman, J. D. 1988. Principles of Database and Knowledge Base Systems, 1. Computer Science Press, Potomac, Maryland.
[59] Woods, W. A., Schmolze, J. G. 1992. The KL-ONE family. In Lehmann, F. W., Semantic Networks in Artificial Intelligence, 133-178. Pergamon Press. Published as a special issue of Computers & Mathematics with Applications, 23, 2-9.
[60] Yen, J. 1991. Generalizing term subsumption languages to fuzzy logic. In Proceedings of the Twelfth International Joint Conference on Artificial Intelligence (IJCAI'91), 472-477.
[61] Zadeh, L. 1965. Fuzzy sets. Information and Control, 8, 338-353.
VOLUME II. INFORMATION TECHNOLOGY
TECHNIQUES IN INTEGRATED DEVELOPMENT AND IMPLEMENTATION OF ENTERPRISE INFORMATION SYSTEMS
CHOON SEONG LEEM AND JONG WOOK SUH
1. INTRODUCTION TO THE INTEGRATED METHODOLOGY FOR ENTERPRISE INFORMATION SYSTEMS
Information technology is an important weapon for improving and sustaining an enterprise's competitiveness in an ever-changing business environment. A systematic methodology is what is most required as a supporting tool for the complicated activities connected with the introduction of information systems. Information systems that are implemented inappropriately can waste enterprise resources and weaken an enterprise's competitiveness. Therefore, many consulting corporations have developed and applied various commercial methodologies in order to provide systematic guidance on the construction of enterprise information systems. A methodology must integrate the various scattered theories and tools and ensure that all users can utilize it easily. Thus, methodology research has to connect theories and tools from a synthetic viewpoint in order to support the efficient and effective construction of information systems. Moreover, previous research shows that enterprises which have a systematic methodology construct information systems more effectively. Most research works and commercial products, however, lack the architectural integrity and functional applicability to meet these sophisticated needs of enterprises. Lack of architectural integrity is caused by two factors: the absence of a customizable architecture regarding the inner environment and culture of enterprises, and the non-integrated framework for managing the engineering tools and output data used and generated during the development and implementation of information systems. Lack of
the functional applicability is caused by three factors: a broken bridge linking business strategy with information strategy in a rational manner, the absence of economic justification and management systems, and an unreliable mechanism for analyzing and evaluating the level of enterprise information systems. This chapter introduces a new integrated methodology for the successful development and implementation of enterprise information systems.

1.1. Development of information systems
A development methodology for information systems considers the life-cycle of the information system and additional elements. At large, the whole life-cycle of an information system follows the SDLC (System Development Life Cycle). The life-cycle of an information system is composed of planning, analysis, design, implementation, and maintenance:

• Planning: the necessity and purpose of the system, validity check, cost/benefit analysis
• Analysis: investigation of the organizational environment, systems, and user requirements, and configuration of the system functions based on user requirements
• Design: logical system design, design of the new system structure, business processes, and input-output, file/database design, application coding, software development
• Implementation: purchasing hardware, installing systems, user training
• Maintenance: system performance evaluation, user feedback, system upgrades, continuous support

In addition, IS package introduction and implementation, IS outsourcing, economic justification and measurement of IS, analysis of enterprise competency, and administration of IS projects, which have recently been applied in enterprises, are included in the integrated methodology.

1.2. Previous research
A methodology for enterprise informatization and information engineering plays a role in establishing a framework to manage a project: defining operations, setting up the goals and procedures of the project, identifying the resources required during the project, and assigning responsibility. Moreover, it creates the baseline of the project, monitors the executed operations, and evaluates the result of the project. Finally, it helps to identify the parts to be improved for subsequent business. A methodology is generally structured around the system development life-cycle mentioned above. There are several development methodologies, such as IDEF (US Air Force) and ARIS (Scheer), that can support enterprise process and data modeling, and Rose (Rational Corporation), which supports UML. However, these methodologies are not classified as IS development methodologies because they cannot cover the whole range of the enterprise. There are some information system development methodologies focused on the development of IS and the promotion of informatization. To date, the recent information systems development methodologies have been led by IT consulting firms
which have provided consulting services and implemented information systems for many enterprises. Table 1 summarizes major information system development methodologies.

Table 1. Major information system development methodologies

• It develops the information systems with an enterprise model, a data model, and a process model in a knowledge base.
• It takes an IE (information engineering)-based approach composed of planning, analysis, design, construction/acquisition, and evolution of IS. It has been applied to many projects, and is revised and extended periodically.
• ASAP (Accelerated SAP) helps a company implement SAP R/3 by reducing time and cost.

1.3. Overview of the integrated methodology for enterprise information systems
The integrated methodology is composed of pattern & scenario, roadmaps, components and repository as following figure 1. -Patterns & Scenario
The integrated methodology for enterprise information systems has several development paths. These paths are able to be applied to the peculiar characteristics of enterprises. Besides, this integrated methodology offers the scenarios which can be applied originally using the components. Figure 2 shows the relations between the roadmaps and patterns. The patterns suggested in the integrated methodology have the meanings as follows. They are classified into higher and lower patterns by the in/ out state of the enterprises for users to apply this methodology easily. The higher patterns have development/package introduction in development method and traditional!radical approach in development velocity. The lower patterns are classified by industry, size and development range.
6
C hoo n Seong Leem and Jo ng Wook Suh
Development
Package Introduction
E1~~~~~_-:
..J
Figure 1. Co nce pt of the integrated methodology for en terprise inform ation systems.
- Roadmaps
Each pattern and scenario has its own roadmap and is supported by the components applied to each roadmap.

- Components
There are five components in the integrated methodology.
A. Information Strategic Planning Methodology (ISPM): composed of strategic management planning, information strategy, and information systems execution planning; it systematically relates information strategy to management strategy.
B. Economic Justification and Measurement Systems (EJMS): supports accurate and effective investment decisions by quantifying the economic effects of investments in information systems.
Figure 2. Roadmaps and patterns.
C. Evaluation Indices of Industrial Informatization (EIII): evaluates the state of the enterprise, taking the objectives of information systems implementation and all the circumstances related to information systems into consideration.
D. Unified Modeling Technique (UMT): a modeling tool supporting the integration of outputs through the entire life-cycle of information systems implementation. User requirements are reflected effectively by UMT, which makes it easy to implement information systems by connecting modeling outputs to system design and analyzing the linkage among models.
E. Support Systems for Solution Introduction & Evaluation (S3IE): helps enterprise executives to plan a package introduction strategy, to evaluate each package, and to select one.
These five components support the roadmaps described above continuously. Moreover, they can be used independently in the roadmaps which are supported by scenarios.

- Repository
The outputs created in the application of the methodology are stored in the repository. The repository consists not only of a database which plays the role of a storage house for outputs, but also of best practices, a knowledge coordinator, and knowledge storage.

The features of the integrated methodology for enterprise information systems
The integrated methodology for enterprise information systems supports the whole life-cycle (planning, analysis, design, construction, and operation) and takes consistent
approaches through the entire set of stages. This methodology lets enterprises use the components suitable to their business states. The methodology has an architecture composed of Milestones, Phases, Activities, Tasks, and Subtasks. Moreover, the quantitation of the analyzed results and the elimination of irregular factors in the integrated methodology help the user to implement information systems easily using CASE tools, without depending on a consultant's ability. It is also easy to connect qualitative analysis results closely with modeling, and the methodology guarantees good adaptability to users in various states. Finally, it provides evaluation results from various viewpoints.

The approach of the integrated methodology for enterprise information systems
The integrated methodology supports consistency from planning to construction through enterprise models. These four models are the function, organization, information, and technology models. The enterprise models represent and record companies or organizations using simple terms and symbols. They are useful tools for anticipating the shape of the information systems to be constructed and for estimating their justification, cost, and time. Figure 3 shows the conversion of strategy into applications and databases through the integrated methodology. The integrated methodology is supported by the four models, CASE tools and management methods.

Figure 3. Four models in the enterprise model.

Process of the integrated methodology for enterprise information systems
Roadmaps in the integrated methodology lay out the procedures for the implementation of information systems, from information systems planning to construction and maintenance, by relating the tasks in each phase of the methodology. Besides, roadmaps are
sets of actions for achieving the goal, namely enterprise informatization, under consideration of the environment and strategy. Hence, roadmaps in the integrated methodology have several paths.

2. TECHNIQUES OF INFORMATION STRATEGIC PLANNING
2.1. Overview
ISPM (Information Strategic Planning Methodology) plays the role, within the integrated methodology for enterprise information systems, of establishing the Information Strategic Planning (ISP). ISP is defined as the process of defining the business application portfolio, and as the planning whose goal is to achieve business competency by using information systems in innovative ways. Therefore, ISPM means the methodology that makes the requirements of the business clear, converts them into system requirements, and supports the process of implementing information systems.

2.2. Previous research
The role and functions of IS in organizations have changed dramatically in recent years. Most of all, IS has nowadays become a critical value creator, not just a business supporter. Thus, numerous studies on ISP have been conducted. In 1970, Zani defined ISP as a top-down plan concentrating on the alignment of business strategy with the information system plan, which is considered the foundation of ISP research. Afterwards, there have been various studies and corresponding definitions of ISP. King (1994) defined ISP as all planning activities that are directed toward identifying opportunities for using information technology to support the organization's strategic business plans and to maintain an effective and efficient IS function. Lederer and Sethi (1996) defined ISP as the process of identifying a portfolio of computer-based applications that will assist an organization in executing its business plans and realizing its business goals. Baker (1995) defined ISP as the identification of prioritized information systems that are efficient, effective and/or strategic in nature, together with the necessary resources (human, technical, financial), management of change considerations, control procedures and organizational structure needed to implement them. However, many definitions confine ISP to a kind of plan for the IS portfolio, while the scope of ISP needs to expand as the role of IS/IT in the 21st century expands. In this integrated methodology, ISP is defined as follows. ISP comprises all planning activities to identify strategic information requirements and business strategies related to IS/IT, and to support information system development, business transformation and education. Typical objectives of ISP are summarized below:

• aligning investments in IS with business goals,
• directing the efficient and effective management of the IS function and IS resources,
• identifying information requirements and priorities of IS,
• deriving top executives' participation and support in developing IT,
• reducing the implementation and management cost of IS,
• supporting the execution of business plans through IT.
Figure 4. Main flow of ISPM: P1100 Preparation, P1200 Environment Analysis, P1300 As-Is Modeling, P1400 To-Be Modeling, P1500 Value Estimation, P1600 Wrap-Up.
In order to achieve the objectives of ISP, the crucial factors influencing the ISP development process should be studied. One of the best-known studies in this field is Lederer and Sethi's. They identified the following factors influencing ISP: the proliferation and maturity of IT in the company, the complexity of the business plan, the scope of ISP, and the involvement of the IS organization in developing business plans. Organizational factors such as the scale of the organization, the role of top management, and the duration of decision making were also considered major factors.

2.3. Information Strategic Planning Methodology (ISPM)

Objectives of ISPM
Success of Information Strategic Planning (ISP) depends on the linkage between business strategy and information strategy. ISP consists of strategic management planning, strategic information system planning, and execution planning of information systems. Control and management of changes must be conducted to feed ISP back into business strategy. Figure 4 shows the main flow of ISPM.

Key features of ISPM
First, requirement analysis, via document reviews and interviews, and evaluation of existing information systems are conducted in the preparation phase.

Second, the environment analysis phase analyzes the enterprise status, the business goals that decide enterprise strategy, the technical environment, rival companies' systems, and related information technologies. It estimates the enterprise's competitiveness and its level of information systems. Finally, it sets up the key strategic points of the information systems.

Third, the as-is modeling phase models the enterprise in the four modeling elements of technology, organization, information, and function, with simplified terms and symbols, to grasp the full state of the enterprise. It verifies the integrity of the models and analyzes their consistency with business strategy. Finally, it generates improvement processes and examines improvement possibilities.

Fourth, the to-be modeling phase models the future enterprise architecture based on the improvement processes drawn from as-is modeling. It models goals in the four modeling elements of technology, organization, information, and function. After modeling each model, it sets up strategies for implementing the four models and integrates these strategies.

Fifth, the value estimation phase estimates consistency and robustness, to judge how faithfully the former phases followed the methodology in terms of outputs and their formality. It estimates the achieved competitiveness, the level of information systems, and the economic value of the information strategic planning from to-be modeling.

Sixth, the wrap-up phase obtains the final confirmation of the information strategic planning and the endorsement of system users. It includes training plans for newly adopted systems and maintenance plans for the information strategic planning.

2.4. Framework for evaluation of ISP

The objective of evaluating ISP is to reduce rework and development time through early adjustment, to establish an ISP suitable for the enterprise, and to lead the board of directors to take part in IS projects by providing the information necessary for decision making. The framework for evaluating ISP considered in ISPM is divided into ISPM-E1 and ISPM-E2, as shown in Table 2.

Table 2. The framework for evaluation of ISP
- ISPM-E1 (Information Strategic Methodology-Evaluation 1). Role: evaluate the reliableness of ISP; analyze the competency of the To-Be model; analyze the IS performance of the To-Be model; analyze the economic justification of ISP. It checks the authority of ISP processes and the completeness of outputs and their relations, and verifies their agreement and the possibility of being realized.
- ISPM-E2 (Information Strategic Methodology-Evaluation 2). Role: compare the constructed information system with ISP.

Information Strategic Methodology-Evaluation 1 (ISPM-E1)
The evaluation of ISP establishment has four major roles. First, the reliableness check of ISP evaluates the authority of ISP processes. Second, the economic justification of ISP compares the costs with the benefits of the To-Be enterprise models. Third, the analysis of IS performance of the To-Be models evaluates the potential level of the state of IS in the enterprise. Finally, the assessment of competency in IS shows the latent IS competitiveness of the To-Be enterprise models.

Information Strategic Methodology-Evaluation 2 (ISPM-E2)
The evaluation of the execution of ISP covers improvement of IS performance, uplift of IS competitiveness, benefit-cost analysis, and administration of the IS process, as shown in Fig. 5.
Figure 5. The evaluation of the execution of ISP: improvement of IS performance, uplift of IS competitiveness, benefit-cost analysis, and administration of the IS process.
3. TECHNIQUES FOR THE EVALUATION OF INDUSTRIAL INFORMATION SYSTEMS (EIII)
3.1. Overview
Recently, the importance of IS (Information Systems) has rapidly increased, as IS is a key strategic means of promoting the efficiency of enterprise activity. With the dramatic progress of information technology, typical users of IS are expected to use various applications in dynamic enterprise environments. Furthermore, most enterprises pursue the renovation of business processes and strategies through IS. To respond adequately to these trends, enterprises have to establish comprehensive concepts and goals based on the evolutionary characteristics of IS, and to identify their objectives through the continuous evaluation of current IS conditions with a scientific and systematic methodology. This paper examines the evaluation issues of enterprise IS performance, dealing with: (1) a performance improvement model based on the evolutionary characteristics of IS, (2) an integrated evaluation system based on the improvement model, and (3) verification of the efficiency and applicability of the evaluation system.

3.2. Previous researches
This work focuses on the improvement of IS performance through a systematic evaluation methodology. Previous researches can be classified into two types, regarding improvement models and evaluation models of IS performance. Moreover, the researches related to evaluation models concern three kinds of topics: evaluation models, evaluation fields, and evaluation items of IS performance.

Previous researches on the improvement model of IS performance

There are two types of researches related to the improvement of IS performance: one concerns improvement processes and the other concerns improvement stages of IS performance.
Table 3. Researches on improvement processes of IS performance
- PDCA. Improvement process: Plan - Do - Check - Act. Focus: product quality improvement.
- QIP. Improvement process: Characterize the environment - Set goals - Choose and tailor a process model - Execute the process - Analyze the collected data - Learn and feedback. Focus: S/W quality improvement.
- IDEAL. Improvement process: Initiating - Diagnosing - Establishing - Acting - Leveraging or Learning. Focus: process improvement.
- Kaizen. Improvement process: Contact - Awareness - Understanding - Evaluation - Trial Use - Adoption - Institutionalization. Focus: new technology adoption.
PDCA (Plan-Do-Check-Act), IDEAL (Initiating-Diagnosing-Establishing-Acting-Learning), and QIP (Quality Improvement Paradigm) are typical researches on improvement processes. The PDCA, initiated by Shewhart (1931) and generalized by Deming (1986) after World War II, is an improvement process for product quality based on a feedback cycle that can optimize a unit production process. The QIP by the NASA Software Engineering Laboratory is an improvement process for software quality based on a meta-lifecycle model for improving long-term quality. This process has several functions: packaging, assessing, and increasing comprehension of development experience for software. The IDEAL by the SEI (Software Engineering Institute) at Carnegie Mellon University is a process improvement model focused on project management. This model is composed of five steps that are continuously and recursively performed (McFeeley, 1996). The Kaizen model for improving process performance has been applied in the ESPRIT project. The basic concept of this model is the 'adoption curve' for taking up new technology, proposed by Conner and Patterson. Table 3 briefly summarizes these researches (Renaissance Consortium, 1997).

There are several researches on improvement stages of IS performance. Nolan and Wetherbe (1980) suggested six maturity stages of IS focused on data, and Venkatraman (1997) proposed a five-stage model focused on structural innovation of the organization by IS. Vernadat (1996) presented a three-stage model of systems integration according to the expansion of the CIM (Computer Integrated Manufacturing) integration range. The CMM (Capability Maturity Model) by the SEI is composed of five stages derived from the degree of process maturity (Bate, 1995). The ISM (Information Systems Management) model by Tan (1999) is based on the balance between organizational structure and IT components. This model originates from the MIT90s framework, composed of the levels of IT-enabled business reconfiguration, by Venkatraman. In this model, IS fields are divided into three parts: external environments, organization environments, and IS environments. Table 4 shows the researches related to improvement stages of IS performance.
Table 4. Researches on improvement stages of IS performance
- Nolan. Improvement stages: Initiation - Contagion - Control - Integration - Data Administration - Maturity. Focus: data.
- CIM. Improvement stages: Physical System Integration - Application Integration - Business Integration. Focus: system.
- ISM. Improvement stages: Functional Integration - Cross-functional Integration - Process Integration - Business Process Redesign - Business Redesign or Business Scope Redefinition. Focus: business.
- CMM. Improvement stages: Performed Informally (Initial) - Planned & Tracked - Well-defined - Quantitatively Controlled - Continuously Improving. Focus: process.
Previous researches on the evaluation model of IS performance
The evaluation diagnoses the current condition and utilizes its results for future plans, so that the organization can achieve better performance. For instance, the Japanese Deming Prize, the USA's Malcolm Baldrige Award with its 'criteria for performance excellence', and Europe's 'Business Excellence Model' are known to contribute significantly to the quality improvement of products and processes. Also, the USA, the UK, Japan, and the OECD are continuously working out national IS indices so as to gradually raise the level of IS performance (Jeong, 1996). Among research related to evaluation models of IS, DeLone and McLean's IS success model (1992), based on the works of Shannon and Weaver (1949) and Mason (1978), has been referred to by many researchers. This model was examined and improved by Seddon and Kiew (1994), who suggested measures for six fields and proved their appropriateness. Since the IS success model did not cover appropriate measures coinciding with the characteristics of the organization, Saunders and Jones (1992) developed the 'IS Function Performance Evaluation Model', which encompasses a method for selecting appropriate measures corresponding to organizational features. Myers, Kappelman, and Prybutok (1997) worked out the 'Comprehensive IS Assessment Model', which expanded the six evaluation fields of DeLone and McLean's model into eight fields and combined these fields with organizational and external environments. Also, Goodhue and Thompson (1995) and Goodhue (1998) proposed the TPC (Task-to-Performance Chain) model based on a technique of fitting to individual performance. The focus of the model is to apply the technique to individual tasks to calculate their positive impact upon individual performance. Additionally, there are several researches related to IS frameworks. Tan (1999) suggested the 'Consistency Model', composed of seven components, which expanded the MIT90s model, and the SEI also proposed a framework composed of seven evaluation fields (Bergey, 1997). As for research on the identification of evaluation items, the GQM (Goal-Question-(indicator)-Measures) methodology was introduced by Basili and Rombach (1988), refined by AMI (1992) and Pulford (1996) in the ESPRIT project, and applied to goal-driven software evaluation by Park (1996) at the SEI. In particular, Mendonça (1998) converted the GQM (Goal-Question-(indicator)-Measures) into another GQM (Goal-Question-Metric) to improve evaluation processes.
Figure 6. Five improvement stages of IS performance: function integration, process integration, business integration, industry integration, and role-model generation. The stage level is computed as a weighted sum W1S1 + W2S2 + W3S3 + W4S4 + W5S5 + W6S6 over the six fields of IS performance (W: weight, S: score).
3.3. The improvement model of IS performance
'IS (Information Systems)' can be defined as integrated systems that collect data, analyze them, generate new useful information, transmit it, and use information related to business activities in organizations, typically business processes in enterprises. 'IS performance', which is usually divided into several stages, is defined as the degree of effectiveness and efficiency in accomplishing business goals by IS. 'Improvement of IS performance' implies that IS performance improves so as to remain flexibly commensurate with changes in internal and external environments and with the various requirements of users, so that IS performance can be optimized with the activities of the organization. 'The improvement model of IS performance' is a representation of these relationships and consists of improvement stages and cycles.

Improvement stage of IS performance
The improvement stage of IS performance plays a major role in the overall evaluation of IS performance. The improvement stage is suggested to consist of five stages, shown in figure 6. As shown in figure 6, the five improvement stages of IS performance in this research are function integration, process integration, business integration, industry integration, and role-model generation. The level of the stage can be determined from the six comprehensive fields of IS performance, which are vision, organization & institution, infrastructure, supporting, application, and usage of IS. 'Function integration' represents computerizing individual tasks within isolated systems. 'Process integration' combines individual processes and functions into the corresponding working group via IS. 'Business integration' is defined as integrating the working groups at the level of the entire organization, and 'industry integration' extends coverage to partner companies and individual customers outside the organization. In the 'role-model generation' stage, the organization can flexibly accommodate new external environments by itself and naturally create new business models from accumulated information and updated IS.
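To make the weighted scoring in Figure 6 concrete, the following minimal sketch computes a stage level from the six field scores as W1S1 + ... + W6S6. The field and stage names come from the text; the weights, the scores, and the even 0-100 score-to-stage cut-offs are illustrative assumptions only, not part of the methodology.

```python
# A sketch of the weighted-sum stage scoring from Figure 6.
# Weights, scores, and stage thresholds are hypothetical.

FIELDS = ["vision", "organization & institution", "infrastructure",
          "supporting", "application", "usage"]

STAGES = ["function integration", "process integration",
          "business integration", "industry integration",
          "role-model generation"]

def stage_of(scores, weights):
    """Return (weighted score, improvement stage) for one organization."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights should sum to 1"
    total = sum(weights[f] * scores[f] for f in FIELDS)
    # Hypothetical cut-offs: 0-100 scores split evenly over the five stages.
    return total, STAGES[min(int(total // 20), len(STAGES) - 1)]

weights = {f: 1 / len(FIELDS) for f in FIELDS}      # equal weights W1..W6
scores = {"vision": 55, "organization & institution": 40,
          "infrastructure": 70, "supporting": 45,
          "application": 60, "usage": 50}           # field scores S1..S6

total, stage = stage_of(scores, weights)
print(f"weighted score = {total:.1f} -> stage: {stage}")
```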
Figure 7. Improvement cycles of IS performance: initiation, goal establishment, diagnosis and evaluation, construction, and learning/leveraging in preparation for the next step.
The improvement stage of IS performance is important in that it can quantitatively represent the current IS status and the target IS status in the future. Seeing that IS environments have many diverse qualitative factors and that these factors are entangled with each other, it is very difficult for an organization to decide the stage level of its current or target IS status. Therefore, in order to decide the stage correctly, the stages should be characterized and explained by various factors. This paper suggests decision factors based on the IS framework, divided into six fields: IS vision, IS organization & institution, infrastructure, supporting, application, and usage.

Improvement cycle of IS performance
The improvement model of IS performance in this paper consists of three components: improvement stages, an integrated evaluation system, and a construction process. It should be applied through five continuous and circular cycles: initiation, goal establishment, diagnosis and evaluation of IS performance, construction process, and leveraging and learning. Figure 7 shows the cycle. As shown in figure 7, the improvement of IS performance can be achieved through five processes. First, the motive to improve IS performance is triggered by stimuli originating from changes in the internal and external environment. Second, the organization should establish a goal (IS vision) that can flexibly cope with the trends of the IS environment. Third, the organization should evaluate the current IS status, identify future objectives, and analyze the gap through comparison between goal states and current states. Fourth, detailed problems in the current states should be considered in the planning and construction of IS projects. Finally, information and knowledge acquired from the previous processes should be utilized; with recursive iterations of the cycle, the IS environments can be continuously reconciled with the management environments of the organization.
Figure 8. Integrated evaluation system of IS performance (preparation, measurement, analysis, interpretation, and feedback steps).
3.4. Framework for the evaluation of IS performance
The integrated evaluation system of IS performance is designed to diagnose the current IS status and to identify the deficiencies of the current status relative to target systems through gap analysis. The system consists of three parts: evaluation procedures, evaluation fields, and evaluation methods. The evaluation procedures can be decomposed into five steps: preparation, measurement, analysis, interpretation, and feedback. The evaluation fields, which originate from the IS framework, can be decomposed into three parts: measurement factors, influence factors, and evaluation factors. The measurement factors represent the static standpoint of the IS framework; the influence factors represent the dynamic standpoint, that is, the relationships between subjects and objects in the IS framework; and the evaluation factors are intended to supply useful information to decision-makers. These factors are measured, analyzed, and interpreted by various evaluation methods. Figure 8 shows a schematic diagram of the integrated evaluation system of IS performance.
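As an illustration of the gap analysis this system performs, the sketch below compares hypothetical current and target field scores and ranks the deficiencies for the interpretation step; the field names and numbers are assumptions for demonstration only.

```python
# A sketch of gap analysis: compare current IS status with target status
# field by field and rank the deficiencies. All values are hypothetical.

current = {"vision": 40, "infrastructure": 65, "application": 55, "usage": 35}
target  = {"vision": 70, "infrastructure": 75, "application": 70, "usage": 60}

gaps = {field: target[field] - current[field] for field in current}

# Interpretation step: report the largest deficiencies first so that
# decision-makers can prioritize construction projects.
for field, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{field:15s} gap = {gap:+d}")
```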
4. TECHNIQUES OF IS ECONOMIC JUSTIFICATION AND MEASUREMENT

4.1. Overview
Investment in information systems for achieving business goals must achieve the maximum effectiveness from limited resources. Thus, IS economic justification and measurement has the goal of supplying a quantification methodology and procedure for the effectiveness of information systems investment. The usual investment propriety analysis of information systems consists of economic propriety analysis, technological propriety analysis, and operational propriety analysis. However, technological and operational propriety analyses are less important here, because most information strategic planning is based on existing information technology and inner resources. IS economic justification and measurement therefore focuses on economic propriety analysis.

IS economic justification and measurement is used either individually or for the purpose of calculating enterprise competency indices in the integrated methodology for enterprise information systems. In the case of individual usage, it helps to compare the estimated effectiveness of all alternatives and to select one. Further, it examines whether or not the estimated effectiveness is realized. When applied as a part of the integrated methodology for enterprise information systems, it decides whether ISP or IS are developed in-house or outsourced. Besides, the effectiveness estimation of IS projects and cost/benefit analysis are performed after an IS project is over.

4.2. Previous researches
Bacon (1992) found that criteria such as support for explicit business objectives and response to competitive systems are important in IS investment decision-making. Renkema and Berghout (1997) discerned four basic approaches (the financial approach, the multi-criteria approach, the ratio approach, and the portfolio approach) and grouped evaluation methods into four classifications: economic appraisal techniques, strategic approaches, analytical appraisal techniques, and integrated approaches. Economic appraisal techniques are structured in nature and include those traditionally used by accountants. They are based on the assignment of cash values to tangible costs and benefits but largely ignore intangible factors. Strategic approaches are less structured in nature but combine tangible and intangible factors. Analytical appraisal techniques are highly structured in design but subjective in nature, with their use often including tangible and intangible factors. Finally, integrated approaches combine subjectivity with a formal structure. These approaches integrate the financial and non-financial dimensions through the acknowledgement and assignment of weighting factors.

4.3. Framework for economic justification and measurement system (EJMS)
The framework for the economic justification and measurement system (EJMS) is classified into cost factors, effectiveness factors, a classifying scheme for enterprise features, a procedure, and techniques for use.

Cost factors
Cost is divided into investment cost and maintenance cost. These represent the resources invested in equipment, time, manpower, and so on, and they are easy to measure numerically. However, because identifying the actual IS investment is difficult, a basic guideline must be provided for extracting the cost factors of an IS project. Cost factors are classified into 12 categories by period and item. Periods are subdivided into construction and maintenance. Items are subdivided into service, labor, overhead cost, hardware, software, and conversion. Table 5 shows the cost factors of EJMS.
Table 5. Cost factors of EJMS
Construction period:
- Service: application development cost, consulting cost
- Labor: employment cost, training cost
- Overhead cost: communication cost, public charge, equipment cost, space cost
- Hardware: server cost, PC cost, N/W cost, peripheral equipment cost
- Software: O/S cost, DBMS cost, application cost
- Conversion: loss of work during information systems introduction, inefficient work during the first stage
Maintenance period:
- Service: application development cost
- Labor: employment cost, training cost
- Overhead cost: communication cost, public charge, equipment cost, space cost
- Hardware: articles of consumption, machine parts, exchange cost, upgrade cost
- Software: upgrade cost
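As a minimal sketch of how the 12 cost cells of Table 5 (two periods by six items) might be tabulated, the following sums investment and maintenance cost; all monetary figures are hypothetical placeholders.

```python
# Totalling the EJMS cost factors of Table 5: 12 cells formed by two
# periods (construction, maintenance) and six items. Figures are
# hypothetical placeholders.

costs = {
    ("construction", "service"):    120_000,  # application development, consulting
    ("construction", "labor"):       80_000,  # employment, training
    ("construction", "overhead"):    30_000,  # communication, public charge, space
    ("construction", "hardware"):   150_000,  # servers, PCs, network, peripherals
    ("construction", "software"):    90_000,  # O/S, DBMS, applications
    ("construction", "conversion"):  40_000,  # work lost during introduction
    ("maintenance", "service"):      25_000,
    ("maintenance", "labor"):        20_000,
    ("maintenance", "overhead"):     15_000,
    ("maintenance", "hardware"):     35_000,  # consumables, parts, upgrades
    ("maintenance", "software"):     10_000,  # upgrades
    ("maintenance", "conversion"):        0,
}

def period_total(period):
    return sum(v for (p, _), v in costs.items() if p == period)

print(f"investment cost:  {period_total('construction'):>9,}")
print(f"maintenance cost: {period_total('maintenance'):>9,} per year")
```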
Benefit factors

Benefits are divided into three kinds according to their characteristics. One is the economic factor, which is measured and evaluated in monetary terms. Another is the numerical factor, which is measured and evaluated by number or volume. The third is the qualitative factor. Benefits are also divided into operational benefits and strategic benefits. Operational benefits mean the enhanced efficiency of firm operations. They consist of cost saving, added profitability, enhanced decision-making, and enhanced business function. Strategic benefits mean enhanced competitive advantages. According to Porter's (1979) five competitive forces model, there are five threats: the threat of new entrants, the power of suppliers, the power of buyers, the threat of substitute products, and the rivalry among existing competitors. Table 6 shows the benefit factors of EJMS.

Benefits can also be classified into easily quantified and hard-to-quantify benefits. Easily quantified benefits are monetary benefits such as the reduction of fixed charges and cost reduction. Hard-to-quantify benefits are abstract benefits such as improvement of service quality, management efficiency, consumers' recognition, enterprise competency, and so on. Techniques that can compare each benefit are needed to estimate and quantify the integrated benefit of an IS project.

Table 6. Benefit factors of EJMS
Operational benefits:
- Cost saving: logistics cost saving, operation cost saving, marketing and sales cost saving, service cost saving, firm infrastructure cost saving, labor cost saving, technology development cost saving, procurement cost saving
- Added profitability: increase of sales, increase of profitability
- Enhanced decision making: time reduction, enhanced quality
- Enhanced business function: enhanced flexibility, enhanced usability, enhanced credibility
Strategic benefits:
- Reduced threat of rivalry: differentiation, cost advantage
- Enhanced supplier relationship: increased suppliers, enhanced supplier manipulation
- Enhanced customer relationship: increased customers, enhanced service
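The sketch below illustrates one simple way of setting easily quantified benefits against project cost, via a benefit/cost ratio and a simple payback period. The figures are hypothetical, and hard-to-quantify benefits would first need a technique such as AHP (see 'Methods used in EJMS' below) before they could be folded in.

```python
# Comparing easy-quantified (monetary) benefits with IS project cost.
# All figures are hypothetical placeholders.

investment_cost    = 510_000         # one-off construction cost
annual_maintenance = 105_000         # recurring cost per year
annual_benefits = {                  # easy-quantified operational benefits
    "logistics cost saving": 90_000,
    "labor cost saving":    120_000,
    "increase of profit":    60_000,
}

years = 5
total_cost    = investment_cost + annual_maintenance * years
total_benefit = sum(annual_benefits.values()) * years

bc_ratio = total_benefit / total_cost
payback  = investment_cost / (sum(annual_benefits.values()) - annual_maintenance)
print(f"benefit/cost ratio over {years} years: {bc_ratio:.2f}")
print(f"simple payback: {payback:.1f} years")
```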
Classification of enterprise features

The same investment in an IS project does not always produce the same results across enterprises. This is caused by the different features and competencies of each enterprise. EJMS considers the type of industry, size, business process quality, alignment with business strategy, and external factors such as the competitive environment. Enterprise sizes are divided into large enterprises and small and medium-sized enterprises. Industries are sorted in EJMS into the manufacturing industry, the finance business, the distribution industry, and the service industry. The features of the enterprise are applied as weights in the economic evaluation in EJMS.

Process of EJMS

EJMS progresses through four phases: (1) Preparation, (2) Analysis, (3) Evaluation, and (4) Reporting. Detailed descriptions are given in Table 7.

Table 7. Processes of EJMS
- Preparation: analysis of investment objective and background; determination of evaluation scope and depth; establishment of evaluation organization and schedule
- Analysis: analysis of organization and business; analysis of information systems; analysis of users
- Evaluation: establishment of cost/effectiveness factors; measurement of cost/effectiveness factors; evaluation of cost/effectiveness factors
- Reporting: comprehensive evaluation; report of evaluation results

Methods used in EJMS
Evaluation of economic or numerical effectiveness is relatively easy. Enhanced productivity, for example, could be measured by the increased number of tasks completed, or by changes in task structure. The hedonic wage model could be applied: a task is composed of subtasks with different value added, and if high-value-added subtasks expand, profitability grows. Qualitative effectiveness is hard to evaluate. AHP (Analytic Hierarchy Process) could be used; it includes three major steps: identifying and selecting criteria, weighting the criteria and building consensus on their relative importance, and evaluating the IS using the weighted criteria. The methods can be classified by their features into measurement methods for quantifying tangible value, estimation methods for quasi-tangible value, and substitution methods for intangible value.
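The sketch below walks through the three AHP steps named above on hypothetical criteria and alternatives; for brevity, criterion weights are approximated by the row geometric means of the pairwise-comparison matrix rather than the exact eigenvector computation.

```python
# AHP-style evaluation: select criteria, weight them from pairwise
# comparisons, then score alternatives against the weighted criteria.
# Criteria, judgments, and scores are hypothetical.

import math

criteria = ["service quality", "management efficiency", "user recognition"]

# Pairwise comparisons on Saaty's 1-9 scale: entry [i][j] states how much
# more important criterion i is than criterion j.
judgments = [
    [1.0, 3.0, 5.0],
    [1/3, 1.0, 2.0],
    [1/5, 1/2, 1.0],
]

def priorities(matrix):
    # Row geometric means, normalized, approximate the principal eigenvector.
    gmeans = [math.prod(row) ** (1 / len(row)) for row in matrix]
    return [g / sum(gmeans) for g in gmeans]

weights = priorities(judgments)

# Alternatives scored per criterion on a common 0-10 scale.
alternatives = {"package A": [8, 6, 7], "package B": [6, 9, 5]}
for name, scores in alternatives.items():
    value = sum(w * s for w, s in zip(weights, scores))
    print(f"{name}: weighted score = {value:.2f}")
```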
5. OTHER TECHNIQUES

5.1. Techniques of requirements analysis
Despite the necessity of strategic IS planning, the process is difficult and replete with opportunities for failure. Many strategic planning efforts produce plans that are never implemented. Cerpa and Verner (1998) presented five key issues in ISP, as follows:
• The involvement and commitment of senior management is essential to the success of the IS plan. It does not matter how good the plan is if the involvement and commitment of senior management is absent.
• Linking IS to business goals is the heart of IS planning; without this link, the IS function will not have major relevance for the organization.
• Choosing the right planning methodology depends on the current use and spread of technology within the organization and the importance of the current systems. Available resources (staff, skills, CASE tools, and so on) will also influence this process. It appears that the use of more than one methodology should be recommended.
• While new technology can be advantageous, it can also pose severe problems if the right skills and expertise are not available to use it properly.
• On-going evaluation of the IS strategic plan is needed to ensure that the plan is implemented correctly and the expected results are being obtained.

If requirements analysis is effective and systematic, the alignment of business and information strategies, the evaluation of ISP, and the implementation of ISP will be achieved efficiently. Requirements analysis is thus a very important procedure in developing ISP. Requirements analysis for developing ISP is divided into two domains: requirements determination and requirements evaluation. First, requirements are determined in the areas of strategy, environment, and user, and supportive tools are suggested to help users draw out requirements with ease. Second, requirements evaluation has two domains: feasibility and importance. Checkpoints for each evaluation are suggested for evaluating the determined requirements. The evaluation of feasibility has four views: economic, organizational, technological, and operational. The evaluation of importance considers two perspectives: that of users and that of management. Figure 9 shows the framework of requirements analysis.

Figure 9. Framework of requirements analysis: business and information strategies, the environment (internal and external) and user needs/problems feed requirements determination (plan for requirements, specified requirements); requirements are then evaluated for feasibility (economic, organizational, technical, operational) and degree of importance (management's view, user's view), leading to a feasible To-Be plan and an execution plan.

Requirements determination

Requirements determination is divided into three domains: strategy, environment, and user. These have sub-domains: business strategy and information strategy for strategy; internal, external, and technological for environment; and needs/problems for user. Each sub-domain and its supportive tools are shown in Table 8.

Table 8. Sub-domains and their supportive tools in requirements determination
- Strategy / business strategy: business strategy statements, SWOT analysis
- Strategy / information strategy: information strategy statements, SWOT analysis, statement of the relationship between information strategy and business strategy
- Environment / internal: value chain analysis, organization chart, RAEW matrix, ERD, DFD, FDD, CRUD matrix
- Environment / external: general environment analysis, Porter's five forces model, statement of evaluation of the competitive environment, portfolio analysis
- Environment / technological: information intensity analysis, strength and weakness analysis of IT, IT environment analysis, IS structure, IT trend analysis, IT in value chain analysis
- User / needs and problems: user requirements analysis

Requirements evaluation
Requirements evaluation is divided into two domains: feasibility and degree of importance. The former checks whether the requirements determined from the various analyses are practically feasible; it has four views: economic, organizational, technical, and operational. The latter finds the higher-priority ones among the feasible requirements; it has two views: management's view and user's view.
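A minimal sketch of this two-step evaluation, filtering determined requirements by feasibility and then ranking the feasible ones by the two importance views; the requirement names, pass/fail results, and scores are illustrative assumptions.

```python
# Requirements evaluation: keep only requirements that pass the four
# feasibility views, then rank by importance from the management and
# user views. All data are hypothetical.

requirements = [
    # (name, passes all feasibility views?, (management, user) importance 1-5)
    ("single customer database",  True,  (5, 4)),
    ("real-time plant dashboard", True,  (3, 5)),
    ("automated RFP generator",   False, (4, 2)),  # fails the technical view
]

feasible = [r for r in requirements if r[1]]

def importance(req):
    mgmt, user = req[2]
    return mgmt + user        # equal weighting of the two views

for name, _, (mgmt, user) in sorted(feasible, key=importance, reverse=True):
    print(f"{name}: management={mgmt}, user={user}")
```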
5.2. UMT (Unified Modeling Tools) and repository

UMT is a modeling tool supporting the integration of outputs through the entire life-cycle of information systems implementation. User requirements are reflected effectively by UMT, which makes information systems easier to implement by connecting modeling outputs to system design and by analyzing the linkages among models. UMT presents tools as matrices, graphs, diagrams, reports, algorithms, and figures.
Figure 10. Architecture of the repository (function and organization models, tips, databases, UMT, and templates).
UMT is supported by a repository storing a knowledge database. It contains industrial best practices, knowledge storage, a database, and a consistency checker (King and Teo, 1994). Figure 10 illustrates the architecture of the repository. Best practices are collections of function, information, technology, and organization models describing to-be enterprises; other enterprises can refer to these models to improve their competitiveness. The consistency checker can eliminate redundant work and preserve the integrity of data stored in the repository. The database stores not only the tools and techniques that UMT offers but also data templates. Knowledge storage is a repository that collects the knowledge and data generated in the progress of a project. Participants in the project are able to obtain important knowledge through this storage; that is, knowledge storage enables users to get many tips related to the problems they face.
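To illustrate the kind of rule a consistency checker might enforce, the sketch below verifies that every function in a function model refers only to entities defined in the information model; the model contents and the rule itself are assumptions for demonstration, not UMT's actual implementation.

```python
# One plausible repository consistency rule: every information entity a
# function uses must exist in the information model. Contents are
# hypothetical.

information_model = {"customer", "order", "invoice"}

function_model = {
    "take order":    {"customer", "order"},
    "issue invoice": {"order", "invoice", "shipment"},  # 'shipment' undefined
}

def check_consistency(functions, entities):
    problems = []
    for function, used in functions.items():
        for entity in sorted(used - entities):
            problems.append(f"function '{function}' uses undefined entity '{entity}'")
    return problems

for problem in check_consistency(function_model, information_model):
    print(problem)
```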
5.3. S3IE (Support Systems for Solution Introduction and Evaluation)
S3IE helps the decision making of enterprise executives in planning a package introduction strategy, evaluating each package, and selecting one. It is based on input data related to each enterprise's specific environment. This component supports the whole process of choosing products suitable for the enterprise environment, through introduction preparation, enterprise environment diagnosis, introduction strategy planning, RFP, proposal document estimation, and package estimation. Table 9 shows the processes of S3IE.

Table 9. Processes of S3IE
- Initiation
- Diagnosis
- Strategy planning
- RFP preparation and software evaluation
- Implementation: customization and unit testing, integrated testing, training, delivery

6. FURTHER WORKS
Though the integrated methodology for enterprise information systems is expected to help enterprises carry out informatization projects, it still has several limitations to be researched hereafter. The additional studies not yet provided in this integrated methodology can be summarized as follows:
• efficient handling of industry-specific enterprise matters,
• improvement of the application of each component,
• additional automated tools for the integrated methodology,
• practical use of best practices and linkage with a process knowledge library,
• linkage with business strategy,
• methods for developing the To-Be enterprise model.
REFERENCES

Bacon, C. J. (1992), "The use of decision criteria in selecting information systems/technology investments," MIS Quarterly, September.
Baker, B. (1995), "The role of feedback in assessing information systems planning effectiveness," Journal of Strategic Information Systems, Vol. 4, No. 1, pp. 61-80.
Cassidy, A. (1998), A Practical Guide to Information Systems Strategic Planning, CRC Press.
Cerpa, N., and Verner, J. M. (1998), "The effect of IS maturity on information systems strategic planning," Information & Management, No. 34, pp. 199-208.
DeLone, W. H., and McLean, E. R. (1992), "Information Systems Success: The Quest for the Dependent Variable," Information Systems Research, Vol. 3, No. 1, pp. 60-95.
Deming, W. E. (1986), Out of the Crisis, MIT Center for Advanced Engineering Study, MIT Press, Cambridge, MA.
Dickson, G. W., Leitheiser, R. L., and Wetherbe, J. C. (1984), "Key information systems issues for the 1980's," MIS Quarterly, September, pp. 135-147.
Fazlollahi, B., and Tanniru, M. R. (1991), "Selecting requirements determination methodology: contingency approach revisited," Information & Management, No. 21, pp. 291-303.
Galletta, D. F., Sampler, J. L., and Teng, J. T. (1992), "Strategies for integrating creativity principles into the systems development process," Proceedings of the Twenty-Fifth Hawaii International Conference on System Sciences, pp. 268-275.
Goodhue, D. L. et al. (1992), "Strategic data planning: lessons from the field," MIS Quarterly, Vol. 16, No. 2, pp. 11-32.
Ha, J.-K. (2001), "A study on the supporting methodology for implementing and evaluating e-Business packages," Master thesis, Yonsei University, Korea.
Jeffrey, H. J. (1996), "Addressing the essential difficulties of software engineering," Journal of Systems and Software, Vol. 32, No. 2 (February), pp. 157-176.
Kim, D.-W. (2001), "A Study on Requirements Analysis for Information Strategy Planning," Master thesis, Yonsei University, Korea.
Kim, S. T. (2001), "A study on enterprise information system investment evaluation," Master thesis, Yonsei University, Korea.
Lederer, A. L., and Sethi, V. (1996), "Key prescriptions for strategic information systems planning," Journal of Management Information Systems, Vol. 13, No. 1, pp. 35-62.
Leem, C. S. (1999), '99 Annual Reports for Evaluation of IS Performance, IT Research and Consulting.
Leem, C. S. (2000), '00 Annual Reports for Evaluation of IS Performance, IT Research and Consulting.
Leem, C. S., and Kim, I., "An Integrated Evaluation System based on the Continuous Improvement Model of IS Performance," Industrial Management & Data Systems, to appear.
Leem, C. S., and Kim, S. (2002), "Introduction to an integrated methodology for development and implementation of enterprise information systems," The Journal of Systems and Software, Vol. 60, pp. 249-261.
Mason, R. O. (1978), "Measuring Information Output: A Communication Systems Approach," Information & Management, Vol. 1, No. 5, pp. 219-234.
McFarlan, F. W., McKenney, J. L., and Pyburn, P. (1983), "The information archipelago: plotting a course," Harvard Business Review, Jan-Feb, pp. 145-156.
Myers, B. L., Kappelman, L. A., and Prybutok, V. R. (1997), "A Comprehensive Model for Assessing the Quality and Productivity of the Information Systems Function: Toward a Theory for Information Systems Assessment," Information Resources Management Journal, Vol. 10, No. 1, pp. 6-25.
Nolan, R. L., and Wetherbe, J. B. (1980), "Toward a Comprehensive Framework for MIS Research," MIS Quarterly, pp. 1-19.
Oh, B. (2001), "A Study on the Development of the Evaluation Framework for Information Strategic Planning," Master thesis, Yonsei University, Korea.
Renkema, T. J. W., and Berghout, E. W. (1997), "Methodologies for information systems investment evaluation at the proposal stage: a comparative review," Information and Software Technology, Vol. 39.
Robertson, S., and Robertson, J. (1999), Mastering the Requirements Process, Addison-Wesley.
Saunders, C. S., and Jones, W. (1992), "Measuring performance of the information systems function," Journal of Management Information Systems, pp. 63-82.
Seddon, P. B., and Kiew, M.-Y. (1994), "A Partial Test and Development of the DeLone and McLean Model of IS Success," Proceedings of the International Conference on Information Systems (ICIS 94), Vancouver, Canada, pp. 99-110.
Shewhart, W. A. (1931), Economic Control of Quality of Manufactured Product, D. Van Nostrand Company, Inc., New York.
Tan, D. S. (1999), "Stages in Information Systems Management," Handbook of IS Management, CRC Press LLC, pp. 51-75.
Venkatraman, N. (1997), "Beyond outsourcing: Managing IT resources as a value center," Sloan Management Review, Spring, pp. 51-64.
Vernadat, F. B. (1996), Enterprise Modeling and Integration: Principles and Applications, Chapman & Hall, pp. 14-16, 317-334.
Zani, W. M. (1970), "Blueprint for MIS," Harvard Business Review, Vol. 48, No. 6, pp. 95-100.
INFORMATION SYSTEMS FRAMEWORKS AND THEIR APPLICATIONS IN MANUFACTURING AND SUPPLY CHAIN SYSTEMS
ADRIAN E. CORONADO MONDRAGON, ANDREW C. LYONS, AND DENNIS F. KEHOE
1. INTRODUCTION
In recent years manufacturing organisations have been facing increasing changes in their business environment. Those changes are driven by customers demanding greater choice in products and services, and by competition from all corners of the globe. Moreover, the manufacturing industry has been subject to a number of trends that include outsourcing, time compression, mass customisation and pricing pressures, to mention just a few. Information and communication technologies can be used by manufacturing organisations to respond to changes in the business environment. The need to respond quickly to changes in market conditions is forcing manufacturing organisations to become more dependent on information technology. Indeed, the pace of technology change offers manufacturing organisations the possibility of implementing new solutions to old problems. Definitions of information technology (IT) like the one provided by Boar [1] are still valid. The researcher described IT as the asset on which an enterprise constructs its business information systems: IT is the preparation, collection, transport, retrieval, storage, access, presentation and transformation of information in all its forms (voice, graphics, text, video and image). IT has been recognised by Shaw et al. [2] as having a major influence on all manufacturing organisations, large or small, and the rapid evolution of IT brings new possibilities for work and collaboration. In manufacturing organisations IT enables information to flow between different business units. IT is considered a means to facilitate the codification, processing and diffusion of information, supporting the development of new
knowledge [3]. Such capabilities become increasingly important for manufacturing organisations facing increasingly competitive business environments. The concept of information systems is broader than that of IT. According to Ezingeard [4], information systems encompass the whole range of procedures that are in place in an organisation. Information systems have been defined as the set of applications that bring together individuals and information flows on IT-based devices and infrastructure. Moreover, added functionality features of information systems enable the execution of new ways of working not experienced before (e.g. concurrent design operations). The historical use of information systems in the manufacturing industry is reviewed in the first sections of this chapter. Trends that are defining the direction of information systems in manufacturing are then considered, and current developments of information systems in manufacturing as well as future research opportunities are provided at the end of the chapter.

2. INFORMATION SYSTEMS USE IN THE MANUFACTURING INDUSTRY
The adoption of IT/information systems in manufacturing has been through an evolution process that started decades ago. The latest developments in information systems for manufacturing represent the utilisation of Internet-based electronic commerce (e-commerce) applications, active agents, widespread use of communication protocols, platform-independent programming languages, the virtual enterprise, and integration not only at the enterprise level but with other organisations. However, information systems applications that were developed some decades ago are still widely used in industry. Examples of widely used information systems include MRP (Material Requirements Planning), MRPII (Manufacturing Resource Planning), CAD (Computer Aided Design)/CAM (Computer Aided Manufacturing), CNC (Computer Numerical Control), SPC (Statistical Process Control), data management, extensive automation using PLCs (Programmable Logic Controllers), robots and AGVs (Automated Guided Vehicles), CIM (Computer Integrated Manufacturing) and EDI (Electronic Data Interchange). The adoption of MRP systems, followed by MRPII, SFDC (Shopfloor Data Collection) and data management, represented revolutionary developments of information systems in the manufacturing sector, helping companies to improve their operations dramatically. Affordable hardware, the ubiquitous use of PCs, and better-performing applications triggered the massive use of information systems in manufacturing. The introduction of automation through the utilisation of PLCs, robots and AGVs gave birth to the concept of CIM, and extended enterprises began to develop as customers and suppliers could be integrated through the use of EDI.

Manufacturing and production information systems can be classified in several ways. Table 1 shows a categorisation based on the impact information systems have on the strategic, tactical, knowledge and operational levels of an enterprise. According to Laudon and Laudon [5], strategic-level manufacturing systems deal with the firm's long-term manufacturing objectives, for example those related to the installation of a new production line. Tactical objectives in manufacturing are involved in the management and control of production costs and resources. Knowledge systems represent the creation and distribution of the knowledge and expertise driving the production process. Operational information systems deal directly with all production tasks involving purchasing, shipping, materials and quality control.

Table 1. Classification of manufacturing information systems in terms of enterprise levels
- Strategic-level systems: production technology, facilities location applications, competitor scanning and intelligence
- Tactical systems: Manufacturing Resource Planning, Computer Integrated Manufacturing, inventory control systems, cost accounting systems, capacity planning, labour costing systems, production schedules
- Knowledge-level systems: computer aided design (CAD), computer aided manufacturing (CAM), engineering workstations, Computer Numerically Controlled (CNC) machine tools
- Operational systems: purchase/receiving systems, shipping systems, labour-costing systems, materials systems, equipment maintenance systems, quality control systems

3. INFORMATION SYSTEMS EVOLUTION IN MANUFACTURING
Shewchuk [6] described the function of information systems in manufacturing as supporting the planning, scheduling and control activities of an organisation. The use of computers in manufacturing is represented by different technology trends that have appeared over the last six decades (Next Generation Manufacturing Project, 1997 [7]). The origins of the use of IT in manufacturing can be traced back to the 50's, but it was not until the beginning of the 70's that IT started to be widely adopted in manufacturing organisations, represented by applications such as CAM and materials transformation. The progress of the 70's saw applications and technologies such as data management, CAD/CAM, MRP, CNC and JIT (Just-In-Time) being developed and widely implemented in manufacturing enterprises. The 80's saw the development and implementation of technologies such as intelligent scheduling, supplier partnerships, CIM, automation (the use of PLCs in substitution of relay arrays), robotics, EDI, CAE (Computer Aided Engineering) and MRPII. Gefen [8] highlighted that on occasion MRPII systems are incorporated into larger ERP (enterprise resource planning) packages, enabling companies competing in the global marketplace to redefine, integrate and optimise their supply chains. The same researcher stated that MRPII systems are complex information systems that manage and coordinate a company's supply chain, inventory, bill of materials, production scheduling, capacity planning, job costing, and cash planning. The 90's witnessed the development of software based on object technology, and the widespread use of applications and technologies related to operational modelling, enterprise integration, intelligent sensors, active agents, virtual reality, APC (Advanced Process Control), e-commerce using the Internet and B2B (business to business).
Communication across organisations using heterogeneous application systems has become a reality due to the use of protocols such as TCP (transmission control protocol) and HTTP (hypertext transfer protocol) and of platform-independent programming languages (e.g. Java). The first years of the 21st century have witnessed the consolidation of technologies such as XML (extensible markup language). The Next Generation Manufacturing project (NGM) [7] provided a description of the information systems requirements for supporting the operation of manufacturing organisations facing increased competition and unpredictable business changes. The NGM framework proposed the creation of adaptive/responsive information systems to facilitate rapid response between enterprise partners and their suppliers and customers, enabling inter-enterprise integration. Enterprise integration has been defined as the discipline that connects and combines people, processes, systems and technologies to ensure that a manufacturing company can function as a well co-ordinated whole, by itself and with other organisations [9]. According to the NGM [7], in the future it will be integration with other organisations that enables manufacturing enterprises to survive. In a changing business environment, the information systems function of a company may deal with the problems of standardisation and integration of heterogeneous systems. Integration of information systems plays a significant role in the sense that legacy applications may be needed to keep an enterprise fully operational.

The evolution of IT in manufacturing has motivated researchers to develop a variety of means for classifying the use of IT in manufacturing. For example, Kathuria and Igbaria [10] provided a classification consisting of seven major functional areas: product design, demand management, capacity planning, inventory management, shopfloor systems, quality management and distribution. Randall from Compass Consulting [11] suggested that investments in and use of IT/information systems in manufacturing organisations can be classified as follows:

- Infrastructure covers the Internet, Intranet, databases and operating systems. According to Broadbent et al. [12], infrastructure is the enabling base of shared IT capabilities which provides the foundation for other business systems.
- Planning covers MRP, ERP and APS. These are information systems applications for the assessment of materials and plant resources, business process modelling and real-time decision support. This element of the classification also includes applications used in design (e.g. CAD). According to Robinson and Wilson [13], ERP systems are one of the latest attempts to utilise the capacities of IT to extend management control of the process of capital accumulation. From a technical point of view, ERP systems comprise application domain, back-office and transaction-processing systems [14].
- Execution covers workflow and data warehousing among other functions. This element of the classification includes resources that facilitate minute-by-minute transactions and links to other data streams within manufacturing operations, both internal (e.g. ERP systems) and external (e.g. customers, suppliers and service providers). Execution also covers applications such as CAM and CNC used in design and manufacturing processes.
Figure 1. Information systems role in manufacturing: the IT department provides applications and solutions, in terms of infrastructure, planning and execution, linking suppliers (materials, supplier requirements) and customers (customer requirements/opportunities, products/services) to the manufacturing organisation.
Data warehousing technology has emerged as one of the most powerful tools for delivering information to end users. A data warehouse offers integrated, historical information that can be accessed by end users directly. The aim is to provide a consolidated view of information (both summary and detailed) to facilitate end-user queries for management and decision support [15].

Figure 1 depicts the traditional support the IT function provides in manufacturing enterprises. In this simplified model, the information systems department is responsible for providing the applications/solutions in terms of infrastructure, planning and execution. The organisation reacts to customer opportunities by providing the required services or products. Information systems applications/solutions used to support business processes link the IT function to the organisation. Certainly, the adoption of new manufacturing paradigms may require the IT function to affect not only the organisation alone but also the interrelation between the organisation and its suppliers and customers.

Information systems in manufacturing are used to manage the bills of materials (the parts needed to assemble a product), inventory and procurement, integrating their management with production scheduling, capacity planning and job costing (calculating the cost of each product according to the inventory, machine and work time needed). All these activities are typically coupled with related systems, including accounts payable, vendor management, RFQ (request for quotation), order and delivery processing, and billing activities. Figure 2 shows a typical manufacturing information system, comprising material requirements planning systems, bill of materials (production and reports) and material components data storage.

Figure 2. A bill-of-materials production system: entry of component data changes, online queries, and data elements such as component number, description, unit and unit cost.
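A minimal sketch of the bill-of-materials data elements shown in Figure 2 (component number, description, unit, unit cost) and of a simple job-costing roll-up over them; the part numbers, quantities, and prices are hypothetical.

```python
# Bill-of-materials data elements and a simple material-cost roll-up.
# Part numbers, quantities, and prices are hypothetical.

from dataclasses import dataclass

@dataclass
class Component:
    number: str
    description: str
    unit: str
    unit_cost: float

catalog = {
    "C-100": Component("C-100", "steel frame", "piece", 42.50),
    "C-200": Component("C-200", "bearing",     "piece",  3.80),
    "C-300": Component("C-300", "paint",       "litre",  9.10),
}

# Bill of materials: component number -> quantity needed per product.
bom = {"C-100": 1, "C-200": 4, "C-300": 0.5}

material_cost = sum(catalog[n].unit_cost * qty for n, qty in bom.items())
print(f"material cost per unit: {material_cost:.2f}")
```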
Infrastructure is an imp ort ant compo nent of information systems. Farbey et al. [16] in their benefits evaluation ladder identified the imp ort ance of infrastru cture . They
32
Adrian E. Coronado Mondragon, Andrew C. Lyons, and Dennis F. Kehoe
Entry of component Data changes
1 Online Queries
..
Bill-of-Materials Production
....._--,-----_.
..
I
Data Elements -Component numbe r, -Description -Unit -Unit cos,t. ...
explained that investments in infrastructure are intended to provide the foundation upon which subsequent value adding applications can be built. Infrastructure investments provide a general capability but may not be targeted at any specific application. Because investments ofthis type do not provide direct benefits to the business, they may therefore not figure prominently in the senior management's value systems. Investments justification needs to demonstrate the link between the infrastructure and subsequent projects whose value to the business can be demonstrated. Moreover, investments in IT infrastructure are seen as necessary in order for the company in question to respond rapidly to any moves by competitors. According to Saaksjarvi [17), investments in infrastructure are long-term commitments accounting for a considerable share of the total IT budget. The researcher emphasised that infrastructure helps the company to integrate and accumulate earlier developments in transaction processing (e.g. Decision Support Systems and strategic information systems). Broadbent et al. [12] defined IT !information systems infrastructure as the base foundation of budgeted-for IT capability (both human and technical), shared throughout the organisation in the form of reliable services. The focus of investment justification
turns from specific applications to the capability of an infrastructure to support a range of future activities. However, IT infrastructure can be a constraint where systems are not compatible, or where inconsistent data models have been used in different parts of the business. The same researchers concluded that knowledge of the role of IT infrastructure capabilities remains largely "in the realms of conjecture and anecdote". Flexibility is also an important issue for information systems infrastructure, since evaluation concerns not a specific application but the capability of the infrastructure to support a range of future developments. According to Hanseth and Braa [18], benefits from IT infrastructure only accrue through business applications; infrastructure cannot be designed and managed in the same way as information systems, as it is created by several actors and can thus be changed only gradually. However, the contribution of IT can be directly measured through the support different applications provide for business processes.

4. ELECTRONIC COMMERCE AND MANUFACTURING INFORMATION SYSTEMS
Internet-based e-commerce applications are in part aimed at enabling inter-enterprise integration. Kettinger and Hackbarth [19] highlighted that e-commerce is about rethinking business models by exploiting information asymmetries, leveraging customer and partner relationships and finding the right fit of co-operation and competition. Their work showed an evolutionary process faced by organisations in the areas of e-commerce strategy, business strategy, scope, payoffs, levers and the role of information. Table 2 shows the areas involved in this evolutionary process.
Table 2. Levels of development of e-commerce in organisations

e-commerce strategy
  Level 1: No EC strategy
  Level 2: EC strategy supports current business strategy
  Level 3: EC strategy supports breakout ("to be") business strategy
Business strategy
  Level 1: EC not linked to business strategy
  Level 2: EC strategy supports business strategy
  Level 3: EC is a driver of business strategy
Scope
  Level 1: Departmental/functional orientation
  Level 2: Cross-functional participation
  Level 3: Cross-enterprise involvement, interconnected (customers, suppliers and consumers)
Payoffs
  Level 1: Unclear
  Level 2: Cost reduction, business support and enhancement of business processes
  Level 3: Revenue enhancement, increased customer satisfaction, drastic improvement in customer service
Levers
  Level 1: Technological infrastructure and software applications
  Level 2: Business processes
  Level 3: People, intellectual capital and relationships
Role of information
  Level 1: Secondary to technology
  Level 2: Supports process efficiency and effectiveness
  Level 3: Information asymmetries used to create business opportunities
The levels of development shown in Table 2 represent attributes required by manufacturing organisations to succeed in an environment of constant change. For example, in the scope area, cross-enterprise involvement and collaboration are compatible with the attributes of enterprise integration and close supplier relationships emphasised in manufacturing paradigms such as agile manufacturing [20]; the attribute of customer satisfaction in the payoffs area is compatible with the attribute of satisfying customer requirements in TQM or lean manufacturing; and the attribute of leveraging the impact of people and information, identified by Kidd [21], is addressed in level three of the levers area.

The foundations of e-commerce systems are the software components that deliver business-to-business (B2B) or business-to-consumer (B2C) services [22]. A close interaction between customers and suppliers is essential for manufacturing organisations in a business environment in constant change. In the view of Gunasekaran [23], the main motivation behind e-commerce is to improve the response to customer demand as quickly as possible by directly collecting the customer's requirements through an online communications system. The primary benefit of EDI to businesses is a considerable reduction in transaction costs through improving the speed and efficiency of filling orders. E-commerce is a digital platform that pervades all functions and departments within a company. According to Gunasekaran [23], e-commerce can ensure higher quality, reduced costs and increased responsiveness. The researcher declared that e-commerce applications are intended to provide the capabilities to manage the supply chain, that is, the ability to deliver products faster, shortening the cycle from order to cash receipts. The addition of e-commerce to the development of information systems in manufacturing is compatible with concepts that emerged during the 1990s such as the extended enterprise [24] and the extended supply chain [25]. In fact, e-commerce through the utilisation of the Internet may facilitate the seamless integration of suppliers and customers. According to Kasarda and Rondinelli [26], today even small and medium-sized enterprises increasingly rely on international networks of suppliers, distributors and customers to improve their global competitiveness. The use of e-commerce to integrate operations with customers and suppliers may ease responding to changing customer demand, facilitate adaptation to a changing business climate and provide the flexibility to redesign processes towards suppliers and customers while enabling decentralised operations. Furthermore, the adoption of Internet-based e-commerce applications should be within easier reach of smaller manufacturers, compared with other, more expensive applications. However, the ubiquitous access to information and acquisition of technology necessarily demands adequate management policies to deliver any sort of advantage.

5. VIRTUAL ORGANISATIONS AND MANUFACTURING INFORMATION SYSTEMS
Day-to-day operations in manufacturing organisations require the integration of information systems across scattered manufacturing plants. The utilisation of Internet technologies can bring together applications related to resource planning (MRP, ERP and cost accounting systems), manufacturing execution (factory-level coordinating and tracking systems) and distributed control (floor devices and process control systems). This integration is the foremost step towards the consolidation of operations to form virtual organisations.
The wide utilisation of Internet-based e-commerce, supported by an IT/information systems infrastructure and applications for execution and planning, is key to the development of virtual organisations. Reid et al. [27] described that a virtual enterprise is conceived when a need is recognised in the marketplace and a business objective or set of objectives is established. To conceive a virtual enterprise it is important for organisations to understand customer expectations and what it will take to satisfy them. An enterprise is created when relationships are established to eventually bring together the requisite competencies. Different researchers [28] have provided guidelines for the formation of virtual organisations. The virtual enterprise concept has been used to characterise the global supply chain of a single product in an environment of dynamic networks of companies engaged in many different complex relationships [29]. Manufacturing organisations need to implement information systems able to cope with several technical constraints, such as concurrent engineering, inter-network applications, hardware heterogeneity, software for application communication and time constraints. A sound methodology will also be required to re-define, where necessary, business processes, the states of synchronisation, the way collaboration is achieved and, once a virtual enterprise has been formed, under which organisation it will be managed.

6. PARADIGM SHIFTS IN MANUFACTURING
Information technology and systems occupy a relevant place in the literature of a significant number of improvement paradigms for manufacturing, such as agile manufacturing [31] and mass customisation [32,33]. Indeed, in almost all improvement initiatives for manufacturing organisations, information systems play a role. For example, JIT (Just-In-Time) emphasises minimising (if not eliminating) waste in the form of inventories in order to reduce costs. JIT empowers employees to check quality at the source and ensures that products are consistently made to standard. Some information systems applications have been classified as JIT. However, some researchers argue that JIT is more a philosophy than just another computerised planning system intended for repetitive environments with stable schedules, narrow product ranges and standard items [10]. In the early 1990s Business Process Re-engineering (BPR) was the focus of attention in the manufacturing industry; BPR is essentially supported by IT. Then lean thinking gained the attention of manufacturing managers. Lean means doing more with fewer resources and banishing waste (Womack and Jones [30]). Information systems have been identified as key enablers of concepts such as the extended and virtual enterprise [34] and hence are considered to be important components of agility. According to the originators of the concept of agility [31], agile manufacturing organisations operate in dynamic business environments typified by rapidly emerging customer opportunities; success in such environments requires information systems that enable the organisation to react quickly to those opportunities. Researchers in the fields of industrial engineering and operations management have remarked upon the importance of a dynamic business environment in shaping all
the activities of manufacturing organisations [31]. Manufacturing organisations need to grasp those emerging customer opportunities and turn them to their advantage. Another paradigm in manufacturing that has received attention is Build-to-Order (BTO). The concept of build-to-order does not imply mass customisation per se; rather, mass customisation depends on the adoption of BTO schemes that enable the production of customised goods. Indeed, the capability to build goods without any sort of delay is a component of any mass customisation initiative aimed at meeting customer needs in the shortest time possible. BTO may provide manufacturers with the capability to grow in a business environment characterised by tough competition and variation in customer requirements. Given the importance of IT/information systems in supporting many manufacturing improvement paradigms, Huang and Nof [35] classified the impact of modern information technologies into three categories: (a) speeding up activities; (b) providing intelligent and autonomous decision-making processes; and (c) enabling distributed operations with collaboration. According to them, the utilisation of IT/information systems enables the creation of:
- New manufacturing/services.
- Strategic information and knowledge management.
- Enterprise integration and management.
- Virtual enterprise.
- Virtual manufacturing/services.
- Concurrent engineering.
- Rapid prototyping.
The same researchers found that IT improves enterprise activities in different areas, including:
- Collaboration: distributed designers can work together on a common design project through a computer-supported collaborative work (CSCW) software system.
- Decisions: powerful computation allows many simulation trials to find a better solution in decision making.
- Logistics: information networks monitor logistics flows of products and packages.
- Recovery: utilisation of artificial intelligence techniques (e.g. knowledge-based logic) to improve the quality of activities.
- Sensing: input devices (e.g. sensors, bar-code readers) can gather and communicate environmental information to computers or people.
- Partners: a computer system in an organisation may automatically find co-operative partners (e.g. vendors, suppliers and subcontractors) to fulfil a special customer order.

6.1. IT and information systems for mass customisation
Da Silveira et al. [32] have provided examples of IT/information systems supporting mass customisation. Indeed, information systems have been defined as enabling
technologies supporting mass customisation. The researchers provided examples that include Motorola using CIM-related technologies (such as Cartesian and gantry robots) to implement two MC factories. Another example cited by Da Silveira et al. [32] is Perkins Diesel. The company based its MC system on a hybrid CAD/CAE (computer-aided engineering) system with flexible manufacturing assembly lines. Computer numerical control (CNC), flexible manufacturing systems (FMS), and communication and network technologies such as computer-aided design (CAD), computer-aided manufacturing (CAM), computer-integrated manufacturing (CIM) and electronic data interchange (EDI) are widely used in all business units of Perkins Diesel. The researchers emphasised that the main motivation behind the extensive use of IT-based communications and networks is to provide direct links between internal units (e.g. design, analysis, manufacturing and testing) and to improve the response time to customer requirements.

7. DEVELOPMENT OF INFORMATION SYSTEMS IN MANUFACTURING
The types of information systems used in manufacturing organisations can be classified into two major groups: in-house development of systems using Rapid Application Development (RAD), and the purchase of systems commonly known as commercial off-the-shelf (COTS) applications. RAD aims at fast development and high-quality products through:
- requirements identification using workshops,
- prototyping and early and continuous user testing of designs,
- re-use of software components,
- compliance with a fixed calendar of activities,
- establishing informal communication channels between team members.
Some software development firms offer products that provide some or all of the tools for RAD software development. These products may include requirements-gathering tools, computer-aided software engineering tools, tools for prototyping, tools for communication among development members, language development environments such as those for the Java platform and XML, and testing and debugging tools. On the other hand, there is no guarantee that RAD developments will not face budget overruns, lack of communication between developers and behind-schedule activities. Certainly, many organisations may avoid the development of their own information systems, turning instead to commercial applications offered by different software vendors. COTS (commercial off-the-shelf) applications are ready-made products that can easily be purchased and implemented. Supporting this, Geffen [8] emphasised that, given the complexity of MRPII systems and the cost of developing them, most MRPII systems are off-the-shelf software. Yet, although the code in these systems is seldom modified by the buyers, these systems do undergo extensive customisation before being successfully deployed. Whatever the type of application used by an organisation, RAD or COTS, information systems development consists of a cycle of seven stages that usually include process workflows, business modelling, requirements, analysis and design, implementation, test and deployment [36]. Figure 3 depicts the information systems development cycle.

Figure 3. Information systems development cycle.

The information systems development cycle is augmented with the stages of maintenance and evolution. These last two stages represent serious challenges for many organisations. In the case of maintenance, the information system should have been developed in a way that guarantees that the related managerial and technical activities meet organisational and business objectives in a cost-effective manner. Evolution should guarantee that further changes to customer requirements can be accommodated. Moreover, according to Hevner et al. [37], in an e-commerce environment many companies try to juggle the need for projects to meet specific customer needs with the desire to create a fundamental product architecture that will support more stable future growth. The adoption of new improvement paradigms in manufacturing will directly affect the complexity of developing information systems solutions. Indeed, software development organisations and in-house teams involved in the development of information systems for e-commerce have to face challenges prompted by a business environment in constant change and demanding customers with ill-defined requirements. The outcome of this situation is priority conflicts between development teams and customer projects. According to Hevner et al. [37], rigorous requirements for security, performance, reliability, portability and availability are essential in order to achieve high levels of customer satisfaction. The researchers stated that many e-commerce companies have moved to a software development environment where they simultaneously pursue product lines (software components tailored to meet a market need in general) and projects (software components designed to meet the needs of a specific customer). The techniques developed for building and deploying information systems should place emphasis on identifying the conditions in the marketplace and the requirements
of their customers, which ultimately shape the functionality of the application. Based on the points highlighted by Hevner et al. [37], Figure 4 shows that any information systems planning and development process is the direct consequence of clearly and previously defined manufacturing and business strategies.

Figure 4. Information systems development for manufacturing: business strategy drives manufacturing strategy, linking product planning and management with information systems planning and development.

Enterprises will not only face different conditions when developing information systems in-house or when customising a commercial off-the-shelf application; they will also have to manage the process of deciding which information systems to acquire. Figure 5 depicts an acquisition process for information systems intended to guarantee that the needs of the manufacturing organisation are met: enterprise business units such as engineering, personnel, information systems, marketing, finance and production (manufacturing and operations) take part in a decision process designed first to meet the immediate needs of manufacturing operations (on the shop floor) and then to meet long-term organisational needs.

Figure 5. Information systems acquisition process.

Manufacturing organisations require tools that guarantee the translation of business needs and requirements into the development of e-commerce information systems. The Unified Modelling Language (UML) is seen as an effective way of managing requirements in information systems development, and the adoption of requirements management is seen as a solution to the ongoing problems of systems development. The IEEE 830 standard defines a requirement for information systems as a capability that the system must deliver. According to Oberg et al. [38], a requirement is a capability of the system needed by the user to solve a problem or achieve an objective, or a
capability that must be met or possessed by a system or system component to satisfy a contract, specification, standard or other formally imposed documentation.

8. EXAMPLES AND HIGHLIGHTS OF INFORMATION SYSTEMS DEVELOPMENTS IN MANUFACTURING
There is consensus among academics about the enabling capabilities of information systems within manufacturing improvement initiatives. Researchers and practitioners like DeVor et al. [39] have stated that recent advances in information networking, processing and e-commerce are rapidly expanding the capability to achieve powerful interactive links among organisational and functional units of the manufacturing enterprise. The researchers discussed how the Internet and the evolution of global networking capabilities enable the creation of an architecture for an open data network. Improvement programmes for manufacturing will become consumers of such information infrastructure functionality and will focus on building information tools and resources. This approach focused mainly on the technical difficulties of joining heterogeneous information systems; future models and frameworks need to consider the non-technical factors behind the performance of information systems in organisations wishing to participate in the virtual enterprise.
A significant number of works available in the literature have placed emphasis on technical factors regarding the development of information systems in support of manufacturing operations. For example, Song and Nagi [40] described manufacturing improvement paradigms making use of modern IT to form virtual enterprises that swiftly respond to changing market demands. The researchers proposed the creation of an Agile Manufacturing Information System with the idea of providing partners with integrated and consistent information. Considerations for the system included partner information interoperability across companies, information consistency across partners in the virtual enterprise, partner policy independence and autonomy maintenance and, finally, an open and dynamic system architecture. The researchers proposed in their model that each participating company becomes a node in a network linking companies to the virtual enterprise. Each company has its own systems (CAD, MRP, CAPP, DBMS) and works as an autonomous unit. Data and workflow hierarchies that would enable organisations to share information and process queries and requests were also contemplated in this model. The proposed framework does not take into consideration the current level of performance of the information systems used in participating companies, and attributes like the IT skills of employees have not been considered. Moreover, for the average SME the formation of virtual enterprises and collaboration with other organisations through information systems is less developed than in other sectors like retailing and financial services. On the same theme, Cheng et al. [41] developed an information systems architecture based on AI (artificial intelligence) and the Internet. This work was deployed to enable remote and quick access to design and manufacturing expertise. The researchers recognised that improvement initiatives in manufacturing are primarily business concepts, but new technology is still one of their most important driving forces. Moreover, the researchers provided a scenario where the Internet is used to speed up information flow in a product development cycle and thus achieve reduced development time and costs. Bullinger et al. [42] developed an integration concept for heterogeneous legacy systems, in which legacy systems are integrated into a company-wide IT architecture through their encapsulation into several business objects that can be re-used and transformed into an object-oriented architecture. The proposed architecture relies on the use of middleware standards for the integration of legacy systems. Other researchers like Whiteside et al. [43] have investigated the use and development of middleware and distributed computing to develop robust information architectures that can be used in the integration of physically distributed design and manufacturing facilities within an enterprise. Researchers have recognised the importance of robust information architectures in supporting the successful adoption of new manufacturing paradigms. Research has continued with the development of seamless enterprise data management solutions in support of manufacturing environments [44,45]. Nowadays XML is a mature tool used to integrate heterogeneous legacy systems. Figure 6 depicts the use of XML/Java applications to insert and extract data into and from web servers.
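As a minimal illustration of the kind of XML/Java extraction application shown in Figure 6 (the file name, element names and attribute names below are hypothetical, not taken from the chapter), the following sketch parses an XML export from one legacy system and extracts component data ready for loading into another:

import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class XmlExtraction {
    public static void main(String[] args) throws Exception {
        // Parse a (hypothetical) components.xml exported by a legacy system.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new File("components.xml"));

        // Extract each component's number and unit cost, ready to be inserted
        // into another system or posted to a web server.
        NodeList components = doc.getElementsByTagName("component");
        for (int i = 0; i < components.getLength(); i++) {
            Element c = (Element) components.item(i);
            String number = c.getAttribute("number");
            String unitCost = c.getElementsByTagName("unitCost").item(0).getTextContent();
            System.out.println(number + " -> " + unitCost);
        }
    }
}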
Figure 6. Use of XML/Java applications to insert/extract data between web servers and databases.

Zhou et al. [46] developed an information management system for production planning in virtual enterprises. The researchers presented a distributed information management architecture for production planning and control in the manufacture of semiconductors. The proposed architecture is based on the Internet and the use of an Object Request Broker. Herrman et al. [47] presented the information required for three functions of agile manufacturing: prequalifying partners, evaluating a product design with respect to the capabilities of potential partners, and selecting the optimal set of partners for the manufacture of a certain product. The implemented model is used as part of a decision support system for design, evaluation and partner selection. The development and use of sophisticated IT/information systems applications like the examples shown above confirms the importance of technology in the future of manufacturing, not only at the managerial level but also at the shop-floor level. Indeed, manufacturing operations on the shop floor will continue to be influenced by the adoption of e-strategies in automation systems, enabling transparent information management, real-time control and condition monitoring across distributed industrial systems. In the late 1990s, agent technologies started to impact manufacturing information systems. According to Turowski [33], a software agent is an autonomous problem-solving unit that may collaborate with other agents to achieve optimised results in a specific problem area. In a manufacturing environment characterised by the use of agent technologies, suppliers and manufacturers will require shared information systems applications that provide them with at least a proprietary interface for exporting and importing data in a non-standard format. Procurement of all parts from suitable suppliers and the resulting demand reports may be transferred to agents. Moreover, the foundation of e-strategies in shop-floor automation lies in the integration of networking and agent technologies developed on open architectures, facilitating the automation of large-scale distributed industrial systems.
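A minimal sketch of the agent notion described above, applied to the procurement scenario (all interface, record and method names are illustrative, not a standard agent API):

import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class ProcurementAgentSketch {

    record Offer(String supplier, String part, double price, int leadTimeDays) {}

    // A supplier agent autonomously answers (or declines) requests for quotation.
    interface SupplierAgent {
        Optional<Offer> quote(String part, int quantity);
    }

    // The procurement agent collaborates with supplier agents: it gathers quotes
    // and applies a simple policy, cheapest offer first, shorter lead time on ties.
    static Optional<Offer> procure(String part, int quantity, List<SupplierAgent> suppliers) {
        return suppliers.stream()
                .map(s -> s.quote(part, quantity))
                .flatMap(Optional::stream)
                .min(Comparator.comparingDouble(Offer::price)
                               .thenComparingInt(Offer::leadTimeDays));
    }

    public static void main(String[] args) {
        SupplierAgent a = (part, qty) -> Optional.of(new Offer("A", part, 9.5, 5));
        SupplierAgent b = (part, qty) -> Optional.of(new Offer("B", part, 9.5, 3));
        System.out.println(procure("seat-track", 28, List.of(a, b)).orElseThrow());
    }
}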
9. A BROADER SCOPE OF INFORMATION SYSTEMS IN MANUFACTURING
Information systems have been seen as an important tool to support the needs of manufacturing organisations facing the pressures of a turbulent business environment. Furthermore, it appears evident that the boundaries that once made applications exclusive to manufacturing, logistics or purchasing operations have disappeared. Indeed, state-of-the-art applications are modular in nature and, once expanded, may cover whole departments and business units of manufacturing enterprises. Figure 7 depicts the integral approach of information systems covering not only configuration and procurement but production and distribution as well. In the diagram it is possible to appreciate that requests and quotations originate at the customer level. Further on, request and negotiation activities take place between the manufacturer and its first-tier suppliers, then between first-tier and second-tier suppliers, and so on. From the customer down to the lower tiers of suppliers, production and distribution involve the placement of orders followed by the delivery of parts and components upstream in the supply chain. Several information-related tools have emerged in recent years to help develop more robust information systems that enable manufacturing organisations to cope with, among other pressures, reacting to customers' needs, reduced product life-cycles, reduced cycle times, cost cutting and rapid product development cycles. Internet-based e-commerce applications linking manufacturers, customers and suppliers have made it possible to overcome difficulties associated with the adoption of solutions such as EDI: investing in EDI only pays off when almost all partners use it [33], and high investment costs for its acquisition meant that SMEs were excluded from adopting it. The use of Internet-based tools in manufacturing has enabled the design of CIM interface systems that reduce communication efforts from quadratic to linear complexity (point-to-point links among n partners require on the order of n(n-1)/2 interfaces, whereas a previously defined common language interface requires only n) and that allow the exchange of design data among manufacturing organisations. Active agents give the opportunity to automate a significant number of operations linking systems across the Internet; the functionality specified in the agents will largely determine the impact and effectiveness of the application as a whole. Components of intelligent agents have been designed to address the needs of manufacturing organisations, including the definition of knowledge bases, problem-solving directives and communication components.

9.1. The role of information systems in improving the performance of manufacturing organisations
Present manufacturing improvement initiatives are highly dependent on the seamless integration of internal and external units provided by efficient information systems. Internal units comprising design, engineering and manufacturing all require seamless integration using information systems, and the integration with external units represents the link between customers and suppliers, likewise enabled by information systems.
Figure 7. Manufacturing information systems integral approach: configuration and procurement, production and distribution. Customers request quotations and place orders with the manufacturer, which negotiates and orders from first-tier, second-tier and n-tier suppliers; parts and components are delivered upstream and the product is delivered to the customer.
According to Da Silveira et al. [32], information systems bring the opportunity of designing an effectively decentralised control architecture that will support the adoption of future manufacturing improvement initiatives such as mass customisation, BTO and agility. Such a control system would be composed of autonomous components with the purpose of reducing complexity, increasing flexibility and enhancing fault tolerance. Further research also needs to be accomplished in the area of enterprise modelling. Open systems architectures for computer-integrated manufacturing and information systems integration like CIMOSA (CIM Open System Architecture) may be considered as background for future developments. Indeed, enterprise modelling encompasses the modelling, analysis, design and implementation of integrated information systems. Certainly, any enterprise modelling methodology will need to consider information systems issues such as overall system architecture, product design, project management and software specifications, including data models, as well as non-technical factors. The potential benefits of any application supporting the needs of manufacturing organisations may depend significantly on the development of an information management infrastructure based on the integration of different standards and tools, including the Internet, STEP (Standard for the Exchange of Product Model Data) and full support of the object-oriented paradigm. Turowski's remarks [33] highlighted that information systems applications used to support e-commerce can be seen as part of a competitive strategy requiring that different production types be employed simultaneously, especially single-item production with its normally high requirements for inter-company interactions.

10. INFORMATION SYSTEMS ENTERPRISE-WIDE SUPPORT: AN EXAMPLE
Information systems in manufacturing organisations cover not only manufacturing operations but also finance, human resources and the supply chain. Emerging tools and protocols are making possible not only enterprise integration but also integration with external enterprises. For a long period of time information systems were seen as islands whose information could not be retrieved by other applications. The intensification of competition, emerging market opportunities and changing customer requirements have motivated firms to start utilising information systems to influence processes comprising procurement, supply chain management, logistics and manufacturing operations. Manufacturing is an information-dependent activity: it depends not only on the efficiency of manufacturing processes but on the quality of the information received, processed and transmitted. Erroneous data may lead to the generation of wrong production schedules, wrong BOMs, wrong purchases and so on. Indeed, erroneous data is responsible for problems experienced in manufacturing such as surplus inventory and excessive lead times. This has motivated researchers to study the fluctuation and amplification of demand from the downstream to the upstream of the supply chain, a phenomenon known as the bullwhip effect.
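A toy simulation can illustrate the bullwhip mechanism (the demand figures, smoothing constant and lead times below are hypothetical, not from the chapter). Each tier forecasts from the orders it receives and also orders to cover the drift in its forecast over the replenishment lead time, so order variability grows as one moves upstream:

import java.util.Random;

public class BullwhipSketch {

    // Update the tier's forecast by exponential smoothing and return its order:
    // the new forecast plus cover for the forecast drift over the lead time.
    static double order(double observed, double[] forecast, double alpha, int leadTime) {
        double previous = forecast[0];
        forecast[0] = alpha * observed + (1 - alpha) * previous;
        return Math.max(0, forecast[0] + leadTime * (forecast[0] - previous));
    }

    public static void main(String[] args) {
        Random rng = new Random(7);
        double[] f1 = {50}, f2 = {50};   // each tier's running forecast
        for (int day = 0; day < 10; day++) {
            double demand = 50 + rng.nextGaussian() * 5;        // end-customer demand
            double tier1Order = order(demand, f1, 0.4, 3);      // 1st tier orders on demand
            double tier2Order = order(tier1Order, f2, 0.4, 3);  // 2nd tier sees only orders
            System.out.printf("demand %5.1f -> tier1 order %5.1f -> tier2 order %5.1f%n",
                    demand, tier1Order, tier2Order);
        }
    }
}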
Researchers have found that the source of such fluctuation and amplification of orders and inventory is mainly the lack of sharing of production information between enterprises in the chain [48]. Information systems are the facilitators of information flow. Information handling becomes critical when manufacturing organisations start introducing initiatives such as flexible manufacturing, lean thinking and agile manufacturing. In fact, enterprises in manufacturing sectors such as the automotive industry have been trying to reduce costs by building tight links with their suppliers. The introduction of sequencing of operations involving first-tier and sometimes second-tier suppliers pushes to the limit the use and reliability of the information required. To emphasise the importance of information systems in manufacturing and supply chains, it was considered appropriate to present a case study of an information-intense industrial environment. The company participating in the case study is a mid-volume manufacturer of mid-luxury vehicles. The vehicle built is a very complex product: thousands of combinations comprise the options available to the final customer. Currently, it takes 14 days for a vehicle to leave the plant from the time it is scheduled for production. Annual production is aimed at 60,000 vehicles.

10.1. Information dependency and intensity
The activities developed for this study involved using Value Stream Mapping to represent physical operations and input/output diagrams for information flow. Data sets were used to record information on the product, volumes, market and manufacturing operations; they served as a repository for information collected during the fieldwork and as a checklist against which required data could be collected. The Value Stream Mapping methodology presented by Rother and Shook [49] was employed to identify the value stream for study. From the analysis, the seating system stream emerged as a value stream adequate for the objectives of this work. Particular characteristics of the seating systems of these vehicles include: (1) seats are independent modules, (2) with complex assembly processes, (3) with a complex sequence of use during vehicle assembly, (4) multi-tier in their own right and (5) very costly. Moreover, seating systems for these vehicles cover up to three tiers of suppliers. Figure 8 depicts the supply chain presented in this example. The seating systems manufactured for these mid-luxury vehicles have the following options: two basic styles (classic and sports), two different materials (cloth and leather), six different colours, plus power, safety and adjustment options. Deliveries from the 2nd tier to the 1st tier supplier are in batches of 28. Deliveries from the 3rd tier to the 2nd tier supplier are in batches of 20. Only the 1st tier supplier's manufacturing facility is based next to the vehicle assembly operations plant. The current offset lead time between the 3rd tier supplier and the primary demand at vehicle assembly operations is 3 days. During the study it emerged that non-value-adding time is skewed towards the upstream processes, especially raw material storage and inventory; analysis estimates showed 22 days' worth of additional stock. Value-adding time was 12.2 hours.
10.2. Information flow and operation of the supply chain
The interface from vehicle assembly to seat assembly is demand driven. Assembly of a unique seat is triggered by the launch of its destination vehicle into the final assembly sequence, at which time the actual seat requirement is sent to the first-tier supplier via a sophisticated broadcast system. Prior to the broadcast of the actual seat requirements, aggregated daily seat requirements have been communicated to the supplier via an electronic file. Each day that file shows the next ten daily requirements, followed by further forecast requirements in tentative weekly and monthly buckets. The first-tier seat supplier uses the information from the file to run its own internal material requirements planning system. The file is loaded each day, and once per week the MRP is run. Supplier schedules are produced for each of the first-tier component suppliers. In the past, schedules were sent to the suppliers via fax; nowadays schedules are accessed by the suppliers via a web-based information system. These schedules normally contain daily requirements for the following week, as well as more tentative forecast requirements in weekly and monthly buckets. Figure 9 illustrates the flow of information observed in this example and shows information systems involvement at an inter-enterprise level. In the manufacturing industry, the flow of information along the supply chain is as important as the flow of materials. To guarantee a reliable flow of information, manufacturing organisations have installed fibre-optic links between themselves and their customers and suppliers. In the case presented in figure 8, the first-tier supplier has a fibre-optic link to vehicle assembly operations. The first-tier supplier runs its own MRP system once a week and the output is sent to the second-tier supplier. Information systems involvement covers the inter-enterprise level (as presented in figure 8) and shop-floor systems as well. Inputs to the systems are provided by sensors placed at each of the workstations located along the assembly line and by buttons and signals triggered by the workers assigned to each workstation. Information systems control the flow of components along the assembly line based on the final assembly sequence provided by the vehicle manufacturer. Figure 10 depicts information systems controlling the operations involved in the assembly of seating systems. The assembly line shown in figure 10 comprises ten different operations; these are sequenced operations and each of them is dependent on the final assembly file received from the manufacturer. The LCD displays situated along the assembly line tell the operators the number of the sequenced seat to be built. The seat components (e.g. headrests, tracks, frames) used in the assembly process have been put in sequence at the company's warehouse. The final assembly sequence file is transmitted via the fibre-optic link, giving the seat manufacturer a time period within which to deliver the assembled seats to the point of fit in the vehicle assembly line. In the diagram shown in figure 10, the displacement of the seats being assembled from one workstation to the next is directly controlled by the PLC. Moreover, the PLC is wired to buttons and signals triggered by the operators assembling the seats.
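As a rough sketch of the requirements-netting step implied by this flow (the requirements calculation netted against the stock float that Figure 9 shows at the first-tier supplier; all quantities below are invented), gross daily seat requirements from the broadcast file are netted against on-hand stock before supplier schedules are produced:

import java.util.Arrays;

public class RequirementsNetting {
    public static void main(String[] args) {
        // Gross daily seat requirements from the broadcast file (next ten days)
        // and the on-hand stock float; all figures are illustrative only.
        int[] grossDaily = {220, 180, 240, 200, 210, 190, 230, 205, 215, 225};
        int stockFloat = 350;

        int[] netDaily = new int[grossDaily.length];
        for (int day = 0; day < grossDaily.length; day++) {
            // Net requirement: whatever gross demand the remaining float cannot cover.
            netDaily[day] = Math.max(0, grossDaily[day] - stockFloat);
            stockFloat = Math.max(0, stockFloat - grossDaily[day]);
        }
        System.out.println("Net requirements to schedule: " + Arrays.toString(netDaily));
    }
}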
Figure 9. Flow of information in the manufacturing supply chain. (Recoverable panels include: vehicle assembly operations with a sequenced order bank built from actual dealer orders, the assembly plant operating plan and sequencing rules; daily segments developed from a file showing the next 9,000 sequences; launch build and broadcast sequence; press shop, body shop and paint shop with an automatic storage and retrieval system; and, at the 1st tier supplier, a requirements calculation netted against the float, MRP, a kanban calculator and weekly schedules issued to component suppliers.)

Figure 10. Information systems involvement: the seating systems assembly line, with LCD displays at each workstation, a minicomputer and database, and a link to vehicle assembly operations.

Table 3. Seat assembly major operations controlled by information systems
1. Place in position set of pallet guides
2. Empty (adjust)
3. Cushion assembly
4. Empty (adjust)
5. Back seat assembly
6. Empty (adjust)
7. Cut labels and check sequence
8. Load seat

Figure 11. Information systems involvement at different levels in the enterprise (manufacturing process level: e.g. sequencing, control and human-machine co-operation).
Other activities undertaken to ensure the smooth running of the assembly line involve database recording of the codes of the seat sets manufactured. A minicomputer runs the system that processes (breaks down) the files received from vehicle assembly operations via the fibre-optic link. Table 3 shows the major operations involved in the assembly of passenger vehicle seats; each operation is done in coordination with the final vehicle assembly sequence. The use of information systems to control the operations comprising an assembly line represents the use of information systems at the manufacturing process level. This basic level comprises the interaction of devices and machines controlled by information systems with the operators working on the assembly line. In the manufacturing sector, it becomes evident that the performance of information systems at the tool level will have a direct impact on the upper levels of the organisation. The upper levels for information systems comprise manufacturing planning and scheduling, corporate accounting and finance, procurement and human resources. Figure 11 depicts the involvement of information systems at the manufacturing process level and the upper levels. The flexibility of manufacturing operations enables the possibility of assembling seats with several available options. In fact, the adoption of flexible manufacturing is the first step towards the adoption of information systems that will support synchronised operations within the enterprise business units and with external enterprises.
Figure 12. Comparing 3rd tier supplier deliveries against primary demand.
10.3. Analysis of information accuracy
The management of information used at the manufacturing process level (e.g. shop-floor systems linked to PLCs and other devices) did not represent a main concern for the manufacturing enterprises; the main problem was with the data used to generate production plans, a problem closely linked to the accuracy of information. Indeed, the purpose of analysing the accuracy of information has been to isolate the effect of inaccurate demand information and to determine the extra stock held in the pipeline to cover for it. This can be achieved by measuring the accuracy of demand information at various points in the supply chain. One particular trim option has been chosen to illustrate the problems generated by information inaccuracy. The problems observed in the analysis of information accuracy are shown in figure 12. The graph shows significant peaks, with deliveries of over 150 units twice and over 200 units once; on the other hand, builds of vehicles with that particular trim option never reached 70 units on any single day of that month. These results are a direct consequence of the flow of information currently in place in the supply chain and of the suppliers' batching policies. The problems observed prompted the development of a prototype information broadcasting architecture for the supply chain under study. Under this alternative, all orders already launched into build are gathered in a single file and presented to the suppliers (i.e. suppliers access the file from a URL, Uniform Resource Locator, using a web browser). This scheme represents 3 days of production, which is much more accurate than the original build plan being used in the supply chain. Rather than using a "go-see" method of production scheduling, the early release of this launch broadcast enables 2nd and some 3rd tier suppliers to redesign their operations so that manufacture and assembly can be driven by required rather than forecast build. Implications anticipated from applying the proposed information broadcasting architecture to a build-to-order scheme include: (1) using electronic channels to broadcast information along different tiers of suppliers, making it easier for customers to modify products, and (2) giving lower-tier suppliers the opportunity to become involved in handling product variety. The alternative information architecture specified for the supply chain under study is depicted in figure 13. This architecture has the potential to make the flow of information and material along the supply chain 100% transparent. The configuration works in the following way: deliveries from the 3rd tier supplier become the stock of the 2nd tier supplier the following day, and the stock at the 1st tier supplier is calculated as the stock at the 1st tier supplier the day before, minus the stock available at the 2nd tier supplier the day before, plus the current stock available at the 2nd tier supplier. The results of this configuration are shown in figure 14. Furthermore, the current stock at the 1st tier supplier could be altered significantly if the output of the MRP system is followed: in that case, the stock at the 1st tier supplier is equal to the stock available the day before, minus the number of components required by vehicle assembly operations, plus the number of components received in batches of 28 from the 2nd tier supplier.
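The stock rules just described can be read as simple day-by-day recurrences. The sketch below encodes one reading of the MRP-driven rule (the demand and delivery series are invented, and the policy that the 1st tier supplier pulls every full batch of 28 available at the 2nd tier is our assumption, not the study's):

public class StockPropagationSketch {
    public static void main(String[] args) {
        // Hypothetical series: daily seat demand from vehicle assembly and
        // daily deliveries dispatched by the 3rd tier supplier.
        int[] demand = {40, 55, 30, 62, 48};
        int[] tier3Deliveries = {60, 40, 60, 40, 60};

        int tier2Stock = 56;    // opening stock at the 2nd tier supplier (assumed)
        int tier1Stock = 120;   // opening stock at the 1st tier supplier (assumed)
        for (int day = 0; day < demand.length; day++) {
            // Rule 1: deliveries from the 3rd tier become 2nd tier stock the next day.
            if (day > 0) tier2Stock += tier3Deliveries[day - 1];

            // Assumed policy: the 1st tier pulls every full batch of 28
            // currently available at the 2nd tier supplier.
            int received = (tier2Stock / 28) * 28;
            tier2Stock -= received;

            // Rule 2 (MRP rule): yesterday's stock, minus components consumed by
            // vehicle assembly operations, plus components received in batches of 28.
            tier1Stock = tier1Stock - demand[day] + received;

            System.out.printf("day %d: tier1 stock %d, tier2 stock %d%n",
                    day, tier1Stock, tier2Stock);
        }
    }
}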
Figure 13. Information broadcast at all levels of the supply chain (e.g. 2nd tier suppliers of seat tracks and seat headrests).
The results plotted in figure 14 show that the 3rd tier supplier's deliveries could be significantly reduced in size. In fact, transparent access to information would enable suppliers to adjust their deliveries based on the figures for primary demand. Moreover, the stock of the 1st tier supplier could be significantly reduced because it depends on the deliveries of the 2nd tier supplier. The stock at the 2nd tier supplier is the same as at the 3rd tier supplier the day before.

11. ENSURING A POSITIVE CONTRIBUTION OF INFORMATION SYSTEMS TO THE ENTERPRISE
The case study presented in the previous section, together with numerous examples available in the literature, has shown the use of information systems to improve the operations of manufacturing enterprises. A set of guidelines has been proposed to help managers understand the contribution of information systems to manufacturing enterprises. The review of important initiatives in manufacturing, such as lean thinking, agile manufacturing, mass customisation and build-to-order, suggests that although information systems are important components that keep the organisation running, much of their impact depends on the design and implementation of sound business strategies and efficient manufacturing processes. An IT strategy is critical for keeping information systems aligned to business and manufacturing plans. The convergence of business, manufacturing and IT strategies in manufacturing organisations motivated the development of a framework for information systems use in manufacturing. The proposed framework is based on the dominant alignment perspectives outlined by Henderson and Venkatraman [50] and consists of three main stages. The first stage starts with developing efficient manufacturing processes based on a sound business strategy. The second stage consists of having a business strategy supported by an IT strategy. The last stage contemplates implementing an IT strategy to lead the company once efficient manufacturing processes have been achieved. Figure 15 depicts this framework. The following steps give details of the possibilities for improving manufacturing operations by using an IT strategy [53].

Figure 15. Stages used to guarantee a positive IT contribution in manufacturing.
1. Development of enhanced manufacturing operations based on a sound business strategy
This stage consists of defining a sound business strategy that is the driver of all changes to manufacturing operations. Information systems at this stage are required to support critical operations; an IT strategy is absent and has no influence on the organisation. The purpose of the business strategy is to start developing the operations side of the company towards improving its manufacturing operations. For example, companies should develop flexibility on the shopfloor (e.g. reduce set-up costs, develop a flexible manufacturing base) where applicable.

2. Definition of an IT strategy to support the business strategy
Feedback from the implementation of a business strategy targeting the effectiveness of operations leads to the definition of an IT strategy. Updates to the business strategy would involve the definition and utilisation of an IT strategy, which is intended to support upgrades to the business strategy after changes have been introduced to business processes. For example, an organisation that has finished, or made significant progress in, developing flexibility on the shopfloor is ready to seek the best IT competencies to further develop its business strategy.

3. Implementation of an IT strategy to lead the company once it has improved its manufacturing operations
Once effectiveness has been achieved on the operations side of the business and an IT strategy has been used to upgrade the business strategy, the next step is the exploitation of emerging IT capabilities to shape new products and services. This enables IT to influence the business strategy of the company and to develop new forms of relationships (e.g. extended inter-enterprise cooperation, formation of virtual enterprises). Implementing an IT-led strategy appears to be a sound way to ensure the competitiveness of manufacturing operations and other business activities. Stage three is ready for implementation once the organisation has achieved substantial performance levels that can be considered benchmarks for the industry. An IT strategy used to influence the business strategy of the company not only ensures the sustainability of improvements to manufacturing operations but also increases the contribution and support of information systems to the firm. This stage has been envisaged to show that it is possible to have an IT strategy leading a company, enhancing performance levels (benchmarks for the industry) in manufacturing operations and other business processes in the organisation. The case study presented in this chapter, plus the numerous cases of information systems failing to deliver expected benefits in manufacturing [51], have motivated a new label for the role of information systems in manufacturing: information systems as "enhancing agents" of benchmark-like performance. E-commerce, virtual enterprises, electronic marketplaces and other IT-based tools should be re-named second-order enablers, or enhancement agents, of benchmark-like performance.
Figure 16. Phases of support of information systems. (Operational level: manufacturing operations, e.g. a 1st tier manufacturer of seating systems, adopt manufacturing enhancement agents such as flexible manufacturing, reduced set-up times, constant price per unit and employee training. Informational level: adoption of IS, covering infrastructure, planning and execution, to enhance manufacturing performance. Outcome: world-class manufacturing performance, benchmarked by others.)

Figure 17. Specification tasks for information systems. (Informational level: the actions that the system has to perform, resulting in enhanced manufacturing performance and value to users; e.g. criteria for information systems design/adoption such as business modelling and use cases.)
Indeed, companies with effective manufacturing operations may rank information systems behind other factors such as the flexibility of shopfloor operations. Figure 16 depicts the three stages involved in the use of information systems to support manufacturing operations. The adoption of information systems to enhance manufacturing performance may include infrastructure, planning and execution systems. However, the adoption process requires specifying the actions that the system will perform, such as enhancing the performance of operations; tools providing support for those tasks include use cases. Figure 17 depicts the use of use cases to shape the design/adoption process of information systems. Defining metrics to measure the contribution of information systems to manufacturing operations in frameworks like the one depicted in figure 16 is extremely important. A set of metrics that may be helpful in measuring the contribution of information systems is presented in figure 18; most manufacturing organisations are familiar with this list of measures. The benefits presented in figure 18 have been classified as strategic, tactical and operational in nature [52]. Several more metrics might be added to the list. Information systems measurement is a research field in its own right and is not contemplated within the structure of this chapter.
Figure 18. Metrics that reflect the contribution of information systems to manufacturing.
Strategic benefits: leader in the use of new technology, market leadership, improved growth and success, improved market share, product added value.
Tactical benefits: flexibility, improved response to changes, improved manufacturing control, improved organisational teamwork, improved data management, improved accuracy of decisions, improved performance monitoring, improved product and service quality, integration with other functions, reduced manufacturing costs, reduced manufacturing lead times.
Operational benefits: information availability to customers, improved capacity planning, improved product traceability, improved data availability and reporting, improved communication, increased productivity, increased plant efficiency, reduced delivery lead times, reduced levels of WIP, reduced labour costs, increased throughput, enhanced or speeded-up data entry.
12. CONCLUSIONS
The relevance of information systems in manufacturing will continue to grow with the development of new tools and technologies. Researchers such as Turowski [33] agree that the deployment of efficient and effective information systems architectures is a key success factor for organisations implementing competitive strategies such as ETO or mass customisation. Tools such as UML will continue to offer the possibility of translating business requirements into outline architectures, for example those required to access manufacturing plants remotely through Internet-like networks. Moreover, UML is a key tool for the translation of business requirements into information systems designs. A tough business environment also demands rigorous procedures to justify investments in IT/information systems. New types of applications and development tools will require the definition of new metrics to measure the performance of information systems investments. With the use of new development tools, the implementation costs of information systems may be reduced, since essential software components are reusable, platform independent and adaptable to particular customer needs. Furthermore, manufacturing organisations will need closer and more reliable information links between their manufacturing operations, supply chain management, finance
and human resources. Reliable information links will make it possible to reduce information variability not only at the enterprise level but also at the multi-enterprise level. In the manufacturing industry, the transparency of information across tiers of the supply chain has proven extremely important for eliminating excess. Indeed, unreliable information has been responsible for high stock levels at all tiers of the supply chain, and may also lead to stockout incidents and backorders. To ensure a positive contribution from new IT tools and information systems applications, improvements to manufacturing operations (e.g. flexible manufacturing operations, reduced set-up costs, etc.) have to be made continuously. Indeed, any positive contribution of IT/information systems to manufacturing enterprises depends on the successful implementation of improvement initiatives in manufacturing operations.

REFERENCES

[1] Boar, Bernard H., 1994. Practical Steps for Aligning Information Technology with Business Strategies: How to Achieve a Competitive Advantage. Wiley, New York.
[2] Shaw M., Seidmann A. and Whinston A., 1997. Information Technology for Automated Manufacturing Enterprises: Recent Developments and Current Research Issues. The International Journal of Flexible Manufacturing Systems, 9, 115-120.
[3] Plekhanova, Valentina, 2001. Engineering the information technology requirements and framework (945-947). Managing Information Technology in a Global Economy. Proceedings of the 2001 Information Resources Management Association International Conference, Toronto, Ontario.
[4] Ezingeard J. N., 1996. Heuristic methods to aid value assessment in the management of manufacturing information and data systems. Ph.D. Thesis, Department of Manufacturing and Engineering Systems, Brunel University, West London.
[5] Laudon K. C. and Laudon J. P., 1998. Information Systems and the Internet: A Problem Solving Approach, 4th edition. The Dryden Press, Fort Worth, TX, USA.
[6] Shewchuk J., 1998. Measures of design change potential for manufacturing information systems: an architecture-based approach. International Journal of Industrial Engineering, 5, 1, 38-48.
[7] Next Generation Manufacturing Project, 1997. Vol. II: Imperatives for Next Generation Manufacturing. U.S. Department of Energy, Washington DC, USA.
[8] Geffen, 2000. It is not enough to be responsive: the role of cooperative intentions in MRP II adoption. DATABASE: The Database for Advances in Information Systems, Volume 31, No. 2, 65-79.
[9] Noori H. and Mavaddat F., 1998. Enterprise integration: issues and methods. International Journal of Production Research, 36, 8, 2083-2097.
[10] Kathuria R. and Igbaria M., 1997. Aligning IT applications with manufacturing strategy: an integrated framework. International Journal of Operations and Production Management, 17, 6, 611-629.
[11] Randall T., 1999. The value of IT in the Manufacturing Sector. Compass Consulting Analysis White Paper.
[12] Broadbent M., Weill P. and Neo B., 1999. Strategic context and patterns of IT infrastructure capability. Journal of Strategic Information Systems, 8, 157-187.
[13] Robinson B. and Wilson F., 2001. Planning for the market: enterprise resource planning systems and the contradictions of capital. DATABASE: The Database for Advances in Information Systems, Volume 32, No. 9, 21-33.
[14] Glass R. L., 2001. The software practitioner: Little Red Riding Hood meets critical social theory. DATABASE: The Database for Advances in Information Systems,
Volume 32, No. 4, 11-12.
[15] Hackathorn R., 1995. Data warehousing energises your enterprise. Datamation, Vol. 41, No. 2, February 1, 38-45.
[16] Farbey B., Land F. and Targett D., 1999. A Taxonomy of Information Systems Applications: the Benefits' Evaluation Ladder. Working paper of the Department of Information Systems, London School of Economics and Political Science.
[17] Saaksjarvi M., 2000. The Roles of Corporate IT Infrastructure and their Impact on IS Effectiveness. Proceedings of the 8th European Conference on Information Systems, 1, Vienna, Austria, 421-428.
[18] Hanseth O. and Braa K., 1998. Technology as Traitor: Emergent SAP Infrastructure in a Global Organisation. Proceedings of the 19th International Conference on Information Systems, 188-196.
[19] Kettinger W. and Hackbarth G., 1999. Mastering Information Management, part seven: Electronic Commerce Special Supplement. Financial Times, Monday March 15.
[20] Yusuf Y. Y., Sarhadi M. and Gunasekaran A., 1999. Agile manufacturing: the drivers, concepts and attributes. International Journal of Production Economics, 62, 33-43.
[21] Kidd P. T., 1994. Agile Manufacturing: Forging New Frontiers. Addison Wesley, Wokingham, UK.
[22] Mahadevan B., 2000. Business models for Internet-based e-commerce: an anatomy. California Management Review, Vol. 42, 55-69.
[23] Gunasekaran A., 1999. Agile Manufacturing: A framework for research and development. International Journal of Production Economics, 62, 87-105.
[24] Childe S., 1998. The extended enterprise: a concept of co-operation. Production Planning and Control, 9, 4, 320-327.
[25] Marchand D., 1999. How to keep up with hypercompetition. Mastering Information Management, part four: The smarter supply chain. Financial Times, Monday February 22.
[26] Kasarda J. and Rondinelli D., 1999. Innovative Infrastructure for Agile Manufacturers. Sloan Management Review, Winter, 73-82.
[27] Reid R., Tapp J., Liles D., Rogers K. and Johnson M., 1996. An Integrated Management Model for Virtual Enterprises: Vision, Strategy and Structure. IEEE International Engineering Management Conference, Vancouver, B.C., 522-527.
[28] Venkatraman N. and Henderson J., 1998. Real Strategies for Virtual Organizing. Sloan Management Review, Fall, 33-48.
[29] Fouletier P., Park K. and Farrell J., 1997. An inter-organisational information systems design for virtual enterprises. Proceedings of the 1997 IEEE 6th International Conference on Emerging Technologies & Factory Automation (ETFA '97), 139-142.
[30] Womack J. and Jones D., 1996. Lean Thinking: Banish Waste and Create Wealth in Your Corporation. Touchstone, London, UK.
[31] Goldman S., Nagel R. and Preiss K., 1995. Agile Competitors and Virtual Organizations: Strategies for Enriching the Customer. Van Nostrand Reinhold, New York.
[32] Da Silveira G., Borenstein D. and Fogliatto F., 2001. Mass Customisation: Literature Review and Research Directions. International Journal of Production Economics, Vol. 72, 1-13. (Permission granted from Elsevier.)
[33] Turowski K., 2002. Agent-based e-commerce in case of mass customisation. International Journal of Production Economics, Vol. 75, 69-81. (Permission granted from Elsevier.)
[34] Gunasekaran A., 1998. Agile Manufacturing: enablers and an implementation framework. International Journal of Production Research, 36, 1223-1247.
[35] Huang C. and Nof S., 1999. Enterprise agility: a view from the PRISM lab. International Journal of Agile Management Systems, 1, 51-61.
[36] Beynon-Davies P., Owens I. and Lloyd-Williams M., 2000. Melding Information Systems Evaluation with the Information Systems Development Life-Cycle. Proceedings of the 8th European Conference on Information Systems, Vienna, Austria, 195-201.
[37] Hevner A. R., Collins R. W. and Garfield M. J., 2002. Product and Project Challenges in Electronic Commerce Software Development. DATABASE: The Database for Advances in Information Systems, Volume 33, No. 4, 10-23.
[38] Oberg R., Probasco L. and Ericsson M., 1998. Applying requirements management with use cases. Technical Paper TP505, Rational Software Corporation, 2-3.
[39] DeVor R., Graves R. and Mills J., 1997. Agile Manufacturing research: accomplishments and opportunities. IIE Transactions, 29, 813-823.
[40] Song L. and Nagi R., 1997.
Design and implementation of a virtual information system for agile manufacturing. IIE Transactions, 29, 839-857.
[41] Cheng K., Harrison D. K. and Pan P. Y., 1997. Implementation of agile manufacturing: an AI and Internet based approach. Journal of Materials Processing Technology, 76, 96-101.
[42] Bullinger H., Fahnrich K. and Linsenmaier T., 1998. A conceptual model for an architecture of distributed objects for the integration of heterogeneous data processing systems in manufacturing companies. International Journal of Production Research, 36, 11, 2997-3011.
[43] Whiteside R., Pancerella C. and Klevgard P., 1998. A CORBA-Based Manufacturing Environment. Proceedings of the 1997 IEEE Conference on Internet Technologies, 34-43.
[44] Wolfe P., Smith R. and Chi Y., 1998. WWW, CORBA and Java: New information technologies for industrial engineering solutions. Proceedings of the 1998 IE Solutions Conference, Institute of Industrial Engineers, 1-6.
[45] Bocks P., 1995. Enterprise data management framework for agile manufacturing. Computer Engineering Division, American Society of Mechanical Engineers, New York, 7, 41-46.
[46] Zhou Q., Souben P. and Besant C., 1998. An information management system for production planning in virtual enterprises. Computers and Industrial Engineering, 35, 1/2, 153-156.
[47] Herrmann J., Minis I. and Ramachandran V., 1995. Information models for partner selection in agile manufacturing. Proceedings of the 1995 ASME International Mechanical Engineering Congress and Exposition, San Francisco, CA, 7, 75-91.
[48] Lau J., Huang G. and Mak K., 2002. Web-based simulation portal for investigating the impacts of sharing production information on supply chain dynamics from the perspective of inventory allocation. Integrated Manufacturing Systems, 13, 5, 345-358.
[49] Rother M. and Shook J., 1999. Learning to See, version 1.2. Lean Enterprise Institute Inc.
[50] Henderson J. and Venkatraman N., 1999. Strategic Alignment: Leveraging information technology for transforming organisations. IBM Systems Journal, 38, 2/3, 472-484.
[51] Ewusi-Mensah K., 1997. Critical issues in abandoned information systems development projects. Communications of the ACM, 40, 9, 75-80.
[52] Coronado A., Sarhadi M. and Millar C., 1999. An Evaluation Model of Information Systems for Agile Manufacturing. Proceedings of the Sixth European Conference on Information Technology Evaluation, 4-5 November 1999, St. John's, Brunel University, West London, 203-213.
[53] Coronado Mondragon Adrian E., 2002. Determining information systems contribution to manufacturing agility for SMEs in dynamic business environments. Ph.D. Thesis, Department of Systems Engineering, Brunel University, West London.
MODELLING TECHNIQUES IN INTEGRATED OPERATIONS AND INFORMATION SYSTEMS IN MANUFACTURING SYSTEMS
Q. WANG, C. R. CHATWIN, AND R. C. D. YOUNG
1. INTRODUCTION
State-of-the-art production facilities require a wide variety of intelligent devices and automated processing equipment to be integrated and linked together through a manufacturing network in order to achieve the desired cost-effective, co-ordinated functionality. Devices within a manufacturing system may include: programmable logic controllers (PLCs), direct numerically controlled (DNC) machines, sensors, robots, vision systems, co-ordinate measurement machines (CMMs), personal computers (PCs) and mainframe computers, supplied by different vendors, using different operating systems, and with different communication needs and interfaces. The successful integration of existing equipment using existing communication protocols and networks is crucial to achieving the functionality required for computer integrated manufacturing (CIM) systems. As a result, the performance of communication networks has become a key factor in the successful implementation of integrated manufacturing systems, particularly for time-critical applications. Hence, the analysis, design and performance evaluation of manufacturing systems can no longer ignore the performance of the communication environment. Until recently, however, system designers lacked feasible and practical combined modelling and simulation methods or tools which would permit them, at the early design stage, to assess such things as how the maximum message delay impacts the shortest machine processing time. That is because most research on the performance of a manufacturing system using modelling and simulation has focused on the
'operational system's aspect'. The term 'simulation', used in a narrow sense, always indicates the performance of manufacturing operations. The 'information processing system's aspect' has had very limited or often separate investigation, without considering the overall performance obtained by taking both aspects into account within the manufacturing plant. One of the major reasons why there are so few studies in this area is the high level of complexity involved. Recent reviews of manufacturing system modelling methods have concluded that, despite the significant number of integrated modelling methods that have reportedly been developed, such as GIM (GRAI integrated methodology), SIM (Strathclyde integration methodology) and ICAM DEFinition (IDEF) simulation methods, there is no single conceptual modelling method which can completely model a manufacturing system or describe most of its sub-systems based on the currently developed simulation tools. Although it is argued that it is not practical or possible to model all aspects of manufacturing systems during their life-cycle engineering and ongoing development, the modelling simulation protagonists continue to enhance models to incorporate an increasing number of features such as model conceptuality, functionality, dynamic aspects and so on. On the other hand, it is generally accepted that traditional planning methods and mathematical or analytical modelling techniques are not appropriate if detailed analysis is required for complex manufacturing systems [2-12]. The performance of the communication system is related not merely to the electronic characteristics of the transmission media, but also to the protocol requirements. For example, many manufacturing companies across the EU have implemented and continue to use the IEEE 802.3 CSMA/CD (carrier sense multiple access/collision detection; Ethernet) protocol within their manufacturing environments to improve the performance characteristics of random-access LANs (local area networks) at extremely low cost. One of the main drawbacks of this protocol is that it uses a random contention method to gain access to the network, i.e., the media access time is non-deterministic. Consequently, this leads to uncertainty: a station which needs to transmit may have to wait an undetermined amount of time before it is able to send a message to its destination. Under certain circumstances this time, which is referred to as the maximum message delay, may be crucial in production, as a long time-delay between two communicating devices may result in lost production or even damage to the system, especially when handling peak network traffic loads. Previous studies by Higginbottom [13] have shown that there are almost no delays in a station's access to a CSMA/CD network at low or medium network traffic load, but performance is dramatically reduced when the load is heavy. It is important to determine, at the early design stage and under all conditions, that the maximum message delay through a LAN is less than the shortest workstation (machine) processing time. This enables the manufacturing system to operate without breakdown in production.
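As a rough illustration of this design check, the following sketch simulates a toy random-access channel (a deliberate simplification, not the full IEEE 802.3 CSMA/CD protocol) and reports the worst message delay observed at several load levels, so that it can be compared against an assumed shortest machine processing time. All parameter values are invented for the example, and near saturation the delay grows with the number of messages simulated.

import random

random.seed(1)
FRAME_TIME = 1.2        # ms to transmit one message (assumed)
SHORTEST_PROC = 50.0    # ms, shortest machine processing time (assumed)

def worst_delay(load, n_msgs=20000):
    """Worst observed delay on a shared channel at a given offered load."""
    channel_free_at = 0.0
    t, worst = 0.0, 0.0
    mean_gap = FRAME_TIME / load              # higher load => shorter gaps
    for _ in range(n_msgs):
        t += random.expovariate(1.0 / mean_gap)   # next message arrival
        start = t
        while start < channel_free_at:            # channel busy: back off
            start += random.uniform(0, 2 * FRAME_TIME)
        channel_free_at = start + FRAME_TIME
        worst = max(worst, start + FRAME_TIME - t)
    return worst

for load in (0.2, 0.5, 0.8, 0.95):
    d = worst_delay(load)
    print(f"load {load:.2f}: worst delay {d:8.1f} ms "
          f"{'OK' if d < SHORTEST_PROC else 'RISK'}")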
However, it is often difficult to determine the maximum message delay, as it is subject to factors controlled by the characteristics of the complex flexible manufacturing system (FMS) and its stochastic behaviour. For instance, Higginbottom's [13] recent work, based on a mathematical analysis of LAN performance, only works out
the mean delay as a function of network throughput or network utilisation. System designers have lacked a feasible and practical combined modelling and simulation method or tool which allows them, at the early design stage, to assess such factors as how the maximum message delay impacts on the shortest machine processing time. Frequently, the LAN designer simply increases the capacity of the network until it delivers a reasonable performance for the manufacturing system. The approach described here offers a quick, visual preview of the system performance by considering both of the above factors. It can also help designers obtain useful information in advance on alternative solutions that meet both operational system and communication system requirements, by providing them with an estimate of network efficiency for the assumed conditions. This also reduces unnecessary investment in systems with excessive capacity, serving a common commercial objective: to build a network with very good performance at minimum cost. There are a few publications in the literature that analyse and compare the performance of the three IEEE 802 standard networks for manufacturing systems. A classical comparative study is often made on the basis of open system interconnection (OSI) transport and data-link layer performance in order to determine the relative merits of CSMA/CD, token bus and token ring networks. For manufacturing environments, the major problem of the ring network is its physical topology, which is usually a poor fit to the layout of most process and assembly lines. Some delay in gaining access to the ring is encountered even at low network load, because a station has to wait for the token. The disadvantages of the CSMA/CD network include a limited cable length (2.5 km with repeaters), which may restrict the layout of the manufacturing plant, and a network efficiency that drops as the network load increases. At high network load, message collisions become a major problem and network performance deteriorates rapidly. Obviously, such a situation cannot be allowed for real-time applications in manufacturing. In contrast, the bus network is the most popular topology for a factory local area network, because its layout can be made to closely match the layout of machines in the factory. The token bus protocol network has excellent throughput and efficiency at high loads, which should satisfy the requirements of process control applications. The major concern is that token bus is a complex protocol, which can raise the cost of the communication equipment. These advantages and disadvantages are always debated when implementing communication systems for manufacturing industries. The debate is greatly curtailed if the protagonists take an integrated modelling and simulation approach and simultaneously investigate the performance of the communication and manufacturing systems.

1.1. Review of integrated modelling simulation methods or tools for manufacturing systems analysis, design and performance evaluation
Because of fierce competition, industry is now being forced into implementing expensive factory automation and is, therefore, carefully re-examining its operating
policies and procedures. For the past decade, several computer-based modelling and simulation methods or tools for modelling, analysing and designing different aspects of manufacturing systems have been developed. The following reviews some of the major developments in the modelling simulation methods and the integrated modelling simulation tools that are used for manufacturing systems [14-21].
• The GRAI (graph with results and actions interrelated) method was developed from earlier graphical modelling methods, which are explored in a branch of mathematics relating to graph theory. GRAI is based upon a conceptual reference model, which uses graphical tools and a structured approach. The reference model is decomposed into three sub-systems, namely physical, information and decision systems. The GRAI graphical tools consist of GRAI grids and GRAI nets. The GRAI grid is represented by a table of rows and columns, and is constructed using a top-down analysis approach. The columns of the grid represent the types of function and the rows contain the decision time scales. The relationships between decision centres are represented on the grid by a simple arrow (an information link) and a double arrow (a decision link). The GRAI net describes the structure of the various activities in each of the decision centres identified in the GRAI grid and is constructed using a bottom-up analysis approach. The activities are the fundamental elements in the grid. Each activity has an initial and a final state, requires the support of information, and produces results. An activity result can be the connecting resource or input to another activity. Another well-known graphical application is Petri nets, which can be used to model more complex systems.
• The ICAM (integrated computer-aided manufacturing) DEFinition (IDEF) family consists of a hierarchy of diagrams, text and glossary. IDEF includes three different modelling methods: IDEF0, IDEF1 and IDEF2, for producing a functional model, an information model and a dynamic model respectively. The IDEF0 functional modelling method is designed to model the decisions, actions and activities of the system. It allows the user to 'tell the story' of what is happening in the system. The diagram is the main component of the IDEF0 model. It presents the system functions as boxes, and data or object interfaces as arrows; the attachment point between arrows and boxes indicates the interface type (input, control, output or mechanism). The generation of many levels of detail through the model diagram structure is one of the most important features of IDEF0 as a modelling technique. An IDEF0 model starts by representing the whole system as a single box (the highest level), which is labelled A0. The A0 box can be broken down into more detailed diagrams until the system is described in the necessary detail (a minimal data-structure sketch of this decomposition is given after this review). The top level of the model presents the most general system objective and is followed by a series of hierarchical diagrams providing more detail about the system being modelled. Some simulation tools have been developed based on IDEF models, such as Design/CPN, Mapping IDEF3 and ARENA.
• SADT (structured analysis and design technique) uses a number of graphical tools including diagrams, actigrams, datagrams, node-lists and data dictionaries. Actigrams describe the relationships between the activity elements and datagrams describe the relationships between the data elements in the diagram structure. A node list is a record of the node contents (title and number) of the actigrams or datagrams used to provide the structure of the subject system. A SADT model depends upon top-down decomposition, starting with a single function which is broken down into child actigrams and datagrams in order to achieve the necessary level of detail.
• SSADM (structured system analysis and design method) provides interfaces between the method procedures and techniques. It breaks the system down into modules containing activity steps; each step has several tasks as inputs and outputs. SSADM contains a number of techniques, including data flow diagrams (DFD), logical data structure (LDS), entity life histories (ELH) and relational data analysis (RDA), to support its modelling methodology. The role of the DFD in SSADM is to provide a functional model of the data flows throughout the system being modelled. The LDS is used to identify system entities from the source of information flow and to specify the relationships between these entities in order to build a diagram representing the logical data structure. The ELH is used to validate the DFD and investigate system data dynamics. The RDA supports the data structures, which are stored in data tables.
Due to the limitations of these methods and techniques, a number of other integrated modelling simulation methods and tools based on the above techniques have been developed by different groups:
• GIM (GRAI integrated methodology) was developed to support overall systems analysis and design. The GIM method integrates four different modelling domains: functional, information, decisional and physical, and presents them in a GIM modelling framework. Furthermore, GIM combines three modelling methods: GRAI (to model decisional systems), MERISE (to model information systems) and IDEF0 (to model physical systems). GIM is supported by a computerised graphical editor called IMAGIM, which offers access to the graphical editors of the method formalisms. The package utilises the GRAI grid and net, IDEF0 and entity/relationship editors. In addition to providing unclear support for the dynamic aspects of manufacturing systems, the linking of the GIM formalisms is not well supported by IMAGIM.
• SIM (Strathclyde integration methodology) comprises two modelling methods, DFDs and GRAI grids, to model information systems in the manufacturing environment. IDEF0 was introduced into the method to complement the use of DFDs. SIM is an effective method for modelling manufacturing information systems, but it does not consider the dynamic aspects of physical sub-systems in the manufacturing environment.
• The GI-SIM (GRAI/IDEF-Simulation) integrated modelling method has reportedly been developed to meet the needs of analysis and design by capturing the
characteristics of a manufacturing system 'completely'. More precisely, the GI-SIM tool provides three interfaces which link (integrate) three existing modelling simulation tools (the GRAI grid, IDEF0 and SIMAN) that have been used for the evaluation of manufacturing systems. SIMAN (now called ARENA) is a powerful simulation package which is mainly used to model and simulate various manufacturing environments. The interfaces, which appear as data-entry windows and have been developed using a visual programming language, can also translate data between the different simulation tools. However, this integrated modelling method does not provide a function or facility which can be used to model the information (communication) systems aspect.
The above integrated modelling simulation methods or tools are designed either to model operational functional dynamic behaviour or to model the information system for manufacturing. Most authors agree that there is no single mature technique which can completely model both aspects of a manufacturing system. Nevertheless, manufacturing systems analysts, designers and their clients have an increasingly important requirement for a 'full' system evaluation, which involves modelling the basic manufacturing operations incorporating the effects of the information (communication) systems, particularly for the investigation of highly integrated time-critical manufacturing systems. These factors eventually led to the development and implementation of the integrated approach presented in this chapter.
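A minimal data-structure sketch of the IDEF0 decomposition described in the review above may help fix ideas: each activity box carries ICOM arrows (inputs, controls, outputs, mechanisms) and can be broken down into child boxes. The activity names and the Python representation are illustrative assumptions, not part of any standard IDEF0 tooling.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Activity:
    """IDEF0-style activity box with ICOM arrows and child diagrams."""
    label: str                      # e.g. "A0"
    name: str
    inputs: List[str] = field(default_factory=list)
    controls: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)
    mechanisms: List[str] = field(default_factory=list)
    children: List["Activity"] = field(default_factory=list)

# Hypothetical top-level box A0, broken down into more detailed boxes.
a0 = Activity("A0", "Assemble PCBs",
              inputs=["bare boards", "components"],
              controls=["production schedule"],
              outputs=["assembled PCBs"],
              mechanisms=["SMT line", "operators"])
a0.children = [
    Activity("A1", "Load feeders", inputs=["components"]),
    Activity("A2", "Place components", mechanisms=["placement head"]),
    Activity("A3", "Inspect boards", outputs=["assembled PCBs"]),
]

def walk(act, depth=0):
    """Print the hierarchy of diagrams, one level of detail per indent."""
    print("  " * depth + f"{act.label}: {act.name}")
    for child in act.children:
        walk(child, depth + 1)

walk(a0)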
1.2. Research objectives

To resolve these problems, we have developed an integrated modelling simulation methodology to a mature stage; this technique permits users to determine the relevant impact of the logical interactions and interrelationships between operations and information (processing) systems within a manufacturing environment. This has been achieved by formulating an integrated model in which both the operational system's functions and the information system's functions can be modelled, simulated and examined together, based on existing simulation tools along with other statistical techniques. In addition, this integrated simulation model can help designers gain a comprehensive preview of the system's performance and behaviour, and provides performance predictions that allow designers to build a system that gives an optimal solution before implementing the real system. The tool provides a particularly distinct improvement in optimising a system's performance within a time-critical manufacturing environment. In principle, this technique is valuable for analysing a wide range of manufacturing systems (CIM, FMS, dynamic process control systems, etc.). Since manufacturing systems involve many aspects, including financial and marketing systems (especially within CIM systems), it is essential to narrow the scope of the analysis and focus on the systems that will benefit the most from an efficient communication network. In this project, a fairly complex flexible manufacturing system for printed circuit board assembly (PCBA) is selected as a case study, and the integrated simulation model of its operational system and communication system
has been built using two major modelling simulation packages, namely ARENA 3.0 and COMNET III. An interface has been established to allow analysts to analyse and convert relevant statistical data between them. A series of pilot simulations has been successfully executed. Since the generated simulation results are of considerable volume, only those that are valuable for the specific research and that meet users' requirements are customised and chosen. In this case, the simulation results which represent the interactions and interrelationships between the operational and the information processing systems are reported, analysed, plotted and discussed herein. This chapter presents a detailed description of this integrated modelling simulation methodology and its integrated simulation model, supported by an overview of previous research work and the fundamentals of current modelling simulation approaches and the most popular optimisation techniques for the evaluation of manufacturing systems. Furthermore, the method has been implemented, tested and demonstrated on an application of the printed circuit board assembly (PCBA) system; the feasibility and benefits of using this tool and an analysis of various simulation outputs are also presented and discussed. Our research has shown that this approach can provide a useful basis for developing existing modelling frameworks and a practical means of extending existing integrated modelling simulation methodologies. The research work has been described in a series of international publications [22, 23, 24] which report the research achievements.

2. THE PCBA SYSTEM
Automated assembly lines are used for the assembly of products in most repetitive assembly sectors. In general, an automated assembly line consists of a number of machines or workstations that are linked together by a conveyor or some other material handling system. The transfer of work-parts occurs automatically and the workstations carry out their specialised functions automatically. Present-day automated assembly systems increasingly use software-controlled equipment, and performance tends to be determined more and more by organisational and logistical constraints rather than by technical constraints. Automatic assembly of printed circuit boards (PCBs) constitutes a core manufacturing process in the electronics industry. The PCBA system is a very highly integrated, automated, flexible and time-critical manufacturing system, which is normally configured as several independent flexible working cells or assembly areas that are mainly equipped with SMT machines using advanced robotics technology and sophisticated vision systems. These assembly cells are linked together by a conveyor system and are integrated by a communication network to co-ordinate the individual assembly systems. Some manufacturing integration companies, such as Universal, provide networking software that can be used to integrate a number of equipment units for electronic production systems. The integration software tools also allow the user to transfer data between a host computer and any of the devices used in the various assembly processes, which are controlled by computerised controllers (such as PLCs) that direct all of the functions and operations throughout the PCBA system.
Figure 1. A typical placement cell for PCB assembly. (The figure shows the head states: gripping head, transferring head, placing head and empty head.)
The operation programs can be downloaded and uploaded from the host to the assembly units. The PCBA system represents a typical case in which large numbers of electronic components are placed automatically on the boards using so-called surface mount technology (SMT). An SMT machine is equipped with one placement head and two component carriages. One or more parallel assembly lines are usually available to place the components on the boards. These lines consist of several placement machines that are linked together by a conveyor system. Each machine in the line places a subset of the required components on the board, and the last machine completes the assembly. Figure 1 illustrates a typical line layout of PCBA cells. The PCBA line consists of two placement machines. Each machine is equipped with one placement head and two component carriages, one at each side of the machine. These carriages can move horizontally. Feeders that contain the components are stored at the stock positions of the carriages; the small vertical lines at the component carriages denote these feeders. The placement head can move in both the horizontal and vertical directions. To place a component, the head moves to the fixed pick position (indicated by the little black square), to which the feeder containing the required component type has already been moved. The head picks up the component and places it at the appropriate position on the board. During the assembly of the PCB at a machine, the board cannot move. In the last decade, PCBA companies have been faced with very high service level requirements in terms of throughput times and delivery reliability.
The size of PCB assembly batches, the demand for different PCB types and the types of components at the assembly lines all vary with time; as a result, the layout of the PCBA system is re-configured frequently and must be re-determined each time. A closer look at the PCB assembly process reveals that the planning and scheduling of PCB assembly is usually very complicated. For example, an unbalanced distribution of the assembly workload for a particular PCB type between the SMT machines in a line can cause a loss of machine assembly capacity. If the workload assigned to one SMT machine is high compared to the workload of the other machines in the line, then the latter machines have to wait until this machine has completed its part of the assembly. When the number of orders increases, it becomes very difficult to achieve a good balance for each order; this results in idle times for the SMT machines. There is therefore a line balancing problem, which requires investigation via animated simulation to minimise the load imbalance between the machines; simulation results provide insight into such factors as the assembly capacity utilisation at each SMT machine or workstation. This is a very important factor at the machine planning level for the PCBA system. If the workload is equal for each machine in a line for a particular order, then the line is said to be perfectly balanced (a minimal workload-balance check is sketched below). The workload consists of picking and placing components and gluing them to the boards. Sequencing is another complicated issue during PCB assembly, due to the requirement for different types of PCB components for different board designs. To address this, the PCBs are tracked from their entry into the system and throughout the processes by making use of bar code labelling and scanning. Due to the lack of a management structure for planning the assembly lines in the PCB industry, line balancing decisions were often left to the operators [25, 26, 27]. Figure 2 shows a hierarchical planning and scheduling approach from Fokkert's [26] research that dealt with the complexity of PCB assembly lines. This relatively complete approach consists of three planning levels: department level planning, line level planning and machine level planning. However, it does not give any details as to how to implement this approach for scheduling and planning, particularly for the two latter levels of a complex PCBA system. Furthermore, a fundamental weakness of this development is that the models and the method are all based on a deterministic approach, whereas the PCBA system is a typical stochastic system. Moreover, it also ignores the effects of the PCBA communication system, which plays a key role in such a highly automated, time-critical integrated system. Therefore, the emphasis on developing a comprehensive but practical integrated approach for the analysis, scheduling and design of complex systems such as the PCBA system is essential in order to achieve cost-effective operations for a wide range of product types.
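As a minimal sketch of the workload-balance check mentioned above, the fragment below computes the per-machine workload for a hypothetical order on a two-machine line and reports the idle time implied by the imbalance. All placement times, component counts and the assignment are invented for the example.

# Hypothetical order: placement times (seconds) per component type, the
# counts per board, and the assignment of component types to machines.
placement_time = {"resistor": 0.6, "capacitor": 0.7, "ic": 2.4, "conn": 3.1}
counts_per_board = {"resistor": 120, "capacitor": 80, "ic": 10, "conn": 4}
assignment = {1: ["resistor", "capacitor"], 2: ["ic", "conn"]}

workload = {
    m: sum(placement_time[c] * counts_per_board[c] for c in comps)
    for m, comps in assignment.items()
}
cycle = max(workload.values())      # the line is paced by the slowest machine
for m, w in sorted(workload.items()):
    print(f"machine {m}: workload {w:6.1f} s, idle {cycle - w:6.1f} s/board")
imbalance = (cycle - min(workload.values())) / cycle
print(f"imbalance: {imbalance:.0%} (0% = perfectly balanced line)")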
3. SIMULATION TOOLS USED

In this project, two simulation tools (ARENA 3.0 and COMNET III) have been utilised to capture the main modelling characteristics (functional and information dynamic aspects) of the complex flexible PCBA system. This has been achieved by developing an integrated simulation model through an interface that allows the analysis and conversion of relevant simulation output data from the ARENA model to the COMNET model.
Mod elling tech niques in integrated opera tions and information systems
Line requirements for the customer orders Analysis of set-u p and placement times Compo nent types loaded on
BOM s of forecas ted and accepted customer orders Earliest and latest assembly period of the orders Avai lable capacity of the lines
'h";~ ~
~
Positions requirements of the component types Component types loaded on the machine s Machine requireme nts of the compo nent
~
73
" '_ _
Feeder requi rements for the component types Packaging of comp onent types Avai lable carriage positions
..I.~
f",~ ;",
...
Feeders to be unloaded for each set-up
Sequence in which components arc placed
Assignment of feeders to carriage positions
Figure 2. A hierarchical appro ach for plannin g and scheduling of a peBA system.
This integrated model can help the system designers modify and justify the system model parameters in order to detect system bottlenecks and to improve the design so as to obtain optimum system performance.
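The interface is essentially a data-translation step. The sketch below illustrates the kind of conversion involved, turning per-station throughput figures from an operations model into message inter-arrival parameters for a network model. The dictionary fields, message counts and sizes are hypothetical; they do not reflect the actual ARENA or COMNET file formats.

# Hypothetical translation: per-station throughput from the operations
# model becomes message inter-arrival parameters for the network model.
MSGS_PER_PART = 3        # assumed: job download, start report, end report

arena_stats = {          # parts per hour per station (invented output data)
    "placement_1": 220.0,
    "placement_2": 180.0,
    "inspection": 200.0,
}

def to_network_traffic(stats):
    """Convert throughput statistics into per-station traffic parameters."""
    traffic = {}
    for station, parts_per_hour in stats.items():
        msgs_per_second = parts_per_hour * MSGS_PER_PART / 3600.0
        traffic[station] = {
            "mean_interarrival_ms": 1000.0 / msgs_per_second,
            "frame_bytes": 512,              # assumed message size
        }
    return traffic

for station, params in sorted(to_network_traffic(arena_stats).items()):
    print(station, params)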
3.1. Operational system model development based on ARENA 3.0

The ARENA software (Systems Modelling Corporation), which is developed using the SIMAN simulation language, divides the simulation process into three steps:
• System model development.
• Experimental frame development.
• Simulation data analysis.
SIMAN is one of the most popular modern simulation languages, specially designed for modelling large and complex (discrete, continuous, or combined) manufacturing systems. SIMAN is designed around a logical modelling framework in which the simulation problem is segmented into a 'model' component and an 'experiment' component. The model describes the physical elements of the system (machines, workers, storage points, transporters, information, parts flow, etc.) and their logical interrelationships. The experiment specifies the experimental conditions under which the model is to run, including elements such as initial conditions, resource availability, the type of statistics gathered and the length of the run. The experimental frame also includes the analyst's specifications (specified externally to the model description) for such things as the schedules for resource availability, the routing of entities, etc. The ARENA modelling framework draws a fundamental distinction between the system model and the experimental frame. The system model describes the physical elements and their logical interrelationships by placing and interconnecting a thread of logic simulation modules, with specific rules, to form a model of a system from its engineering description. The experimental frame defines the experimental conditions, including the analyst's specifications, under which the model is run to generate specific output data. Because the experimental conditions are specified externally to the model description, a given model can be linked with different experimental frames, producing many sets of output data without changing the model description. Once a system model and an experimental frame have been defined, they can be linked and executed by ARENA (through a link processor) to generate output data files. The output data can be displayed as statistical bar charts, functional plots and data tables, which may be customised to accommodate the analyst's needs [28, 29, 30].
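The model/experimental-frame separation can be mimicked outside ARENA. In the minimal sketch below (plain Python rather than SIMAN), a model dictionary fixes the logic and structure while each experimental frame fixes the run conditions, so the same model is executed under different frames without being changed. All names and values are assumptions.

import random

def simulate(model, experiment):
    """Run a single-server queue model under a given experimental frame."""
    random.seed(experiment["seed"])
    free_at, waits, t = 0.0, [], 0.0
    while t < experiment["run_length"]:
        t += random.expovariate(1.0 / model["interarrival_mean"])
        start = max(t, free_at)                 # wait if the server is busy
        free_at = start + random.expovariate(1.0 / model["service_mean"])
        waits.append(start - t)
    return sum(waits) / len(waits)

model = {"interarrival_mean": 5.0, "service_mean": 4.0}   # minutes (assumed)
for frame in ({"seed": 1, "run_length": 480.0},
              {"seed": 2, "run_length": 2400.0}):
    print(frame, "-> mean wait", round(simulate(model, frame), 2), "min")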
3.1.1. Operational system model development

Figure 3 illustrates an example of part of the logic program used to build a model of the printed circuit board assembly (PCBA) system in ARENA. The ARENA model is constructed by placing and connecting modules, which have already been developed individually as integrated 'blocks' or 'modules' using the SIMAN simulation language, to represent distinct process modelling functions in the model window. The appropriate input data can be entered through the modules' dialogues. A model is constructed by selecting standard modules from the available set. The blocks are arranged and linked in a linear logical sequence, based on their functional operation and interaction, to depict the process through which the entities move in the system. A system model generally consists of a number of individual logic modules and data modules. The logic modules are connected in a logical sequence to define the process through which entities flow (customers, work-pieces, patients, communication packets, etc.). During the simulation run, entities may arrive at and depart from logic modules; these remain dormant until they are activated by the arrival of an entity.
Figure 3. ARENA system model development methodology.
In contrast, data modules are used to define data associated with the model. Unlike logic modules, data modules are not connected to other modules; entities do not arrive at or depart from a data module. Data modules are passive in nature and are used only to define data associated with the system parameters. There are three categories of ARENA modules for manufacturing system models: workcentre modules, component modules and data modules. They consist of simulation models to represent the real system [30].
• Workcentre modules describe the logical portion of the manufacturing system. Each of the workcentre modules incorporates all of the logic necessary for processing a part in a specific area: enter the workcentre, exit the material handling process, determine the next workcentre, get material handling, and move to the next workcentre. The workcentre modules are Receiving, Workcentre, Buffer, Assembly, and
Shipping. The callout box in figure 3 represents a typical group of workcentre modules describing the assembly operations of PCBs. Arriving entities (PCBs) are generated by, or transferred from, the 'ARRIVE module' to another station or module (a SERVER, to be processed). The ARRIVE module essentially combines the Create, Station and Leave modules into one module: an entity is created, immediately enters a station, and is transferred to another station or module. In the 'SERVER module', an entity (PCB) enters a station, seizes a server resource (components to be assembled onto the PCB), experiences a processing delay (such as EXPO(1), or 1 minute), and is transferred to another station or module. The SERVER module defines a station corresponding to a physical or logical location where processing occurs. The 'CHOOSE module' provides entity branching based on the 'If' conditional rule in conjunction with the deterministic 'Else' and 'Always' rules. When an entity arrives at the CHOOSE module, it examines each of the defined branch options and sends the original arriving entity (the primary entity) to the destination of the first branch whose condition is satisfied. If no branches are taken, the arriving entity is disposed of. When an entity arrives at an 'ASSIGN module', the assignment value or state is evaluated and assigned to the specified variable or resource. If an attribute or picture is specified, the arriving entity's attribute or picture is assigned the new value. The 'SEIZE module' allocates units of one or more resources to an entity. The SEIZE module may be used to seize units of a particular resource, a member of a resource set, or a resource defined by an alternative method. The 'RELEASE module' is used to release units of a resource that an entity previously seized; for each resource to be released, the name and the quantity to be released are specified (a minimal sketch of this seize-delay-release pattern follows the data-module list below). The 'ROUTE module' transfers an entity to a specified station, or to the next station in the station visitation sequence defined for the entity. A delay time for the transfer to the next station may be defined.
• Component modules are single elements of workcentre modules. These modules are typically used when the logic within a workcentre module is not sufficient to represent all or part of the system concerned. As shown in the right and left icons of figure 4, there are 29 component modules available to be chosen. They fall into two types of purpose: (1) processing operations, and (2) material handling and transfer operations.
• Data modules allow the definition of specific, detailed information about objects that are referenced in workcentre modules and component modules to represent the logical flow of a manufacturing system. The detailed information may include the process plans at the Machines, Operators and Parts modules. There are 17 data modules available; figure 4 illustrates some of them. The following lists the data modules with a brief description of their functionality.
1. Parts: part name, process plan, attribute assignments.
2. ProdPlan: creation of parts into the system.
3. Dispatch: creation of requisitions and transfer requests into the system.
4. Machines: machine name, capacity, breakdowns, statistics.
5. Areas: area name, capacity, statistics.
Figure 4. Data modules and component modules for ARENA modelling.
6. Operators: operator name, capacity or schedule name, statistics.
7. MoveOper: moveable operator names, schedule information, locations, velocities.
8. OperSched: operator schedule name, information.
9. OperSets: set name, operators in set, selection rule.
10. Transporter: transporter name, velocity, characteristics.
11. Conveyor: conveyor name, velocity, type, and characteristics.
12. Analysis: simulation run time, number of replications, detailed statistics.
13. Variables: system variable names, initial values.
14. Paths: unconstrained, moveable operator, transporter and conveyor animated paths for the movement of parts.
15. Networks: paths for guided transporters.
16. ProcPlans: sequence of workcentre steps a part takes, with associated processing information.
17. ProcData: groups of processing information (statistics) for use with process plans.
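The seize-delay-release pattern behind the SERVER, SEIZE and RELEASE modules, referred to above, can be sketched as follows. This toy fragment uses Python rather than SIMAN, and the arrival rate, processing times and resource capacity are invented for the illustration.

import heapq, random

random.seed(3)
CAPACITY = 2                       # assumed: two identical placement heads
free_at = [0.0] * CAPACITY         # next time each resource unit is free

t = 0.0
for pcb in range(1, 6):
    t += random.expovariate(1.0 / 4.0)         # ARRIVE: next PCB (assumed)
    unit_free = heapq.heappop(free_at)         # SEIZE: earliest free unit
    start = max(t, unit_free)                  # queue if all units are busy
    finish = start + random.uniform(2.0, 6.0)  # DELAY: processing time
    heapq.heappush(free_at, finish)            # RELEASE: unit is free again
    print(f"PCB {pcb}: arrive {t:5.1f}, start {start:5.1f}, "
          f"finish {finish:5.1f}")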
In summary, the development of an appropriate conceptual, logical simulation model by programming is one of the major tasks in simulation model construction. Although many simulation languages are commercially available, and hundreds of other locally developed languages are used by companies and universities, the trend in simulation software development has been an emphasis on an integrated simulation environment that provides ease of use. The definition of the model boundary is usually a trade-off between accuracy and cost; a valid model should include only those aspects of the system relevant to the study objectives. Model verification is the process of examining the computer code of a model to ensure that the simulation program is a correct implementation of the model. This process does not ensure that the model appropriately represents the real system; it only ensures that the model is free of errors. Validation is concerned with the correspondence between the model and reality: model validation is the process of determining that a model is a sufficiently adequate approximation of the real system that the simulation conclusions drawn from the model are correct and applicable to the real-world system. Although most simulation tools can automatically detect certain types of errors introduced by a programmer, and may be able to display unintentional errors in a model's logic, they cannot automatically correct or debug those errors. Nor can they find errors in the model's representation of the system, since in that situation the program itself is often correct. A manual verification process is used to avoid common errors such as: data errors, initialisation errors, errors in the units of measurement, flow control errors, blockages and deadlocks, arithmetic errors, overwriting of variables and attributes, data recording errors, and language conceptual errors. Running an animation as a verification aid is found to be very useful for detecting and exposing such errors; such direct observation of errors in model execution speeds up the debugging process.
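A few of the manual verification checks listed above can be automated as cheap assertions over a run's summary statistics. The sketch below is one possible form, with hypothetical field names; it checks flow conservation, units of measurement and clock consistency.

def verify_run(stats):
    """Cheap consistency checks of the kind used in manual model
    verification; the field names are hypothetical."""
    # Flow conservation: every created entity is shipped or still in WIP.
    assert stats["created"] == stats["shipped"] + stats["wip"], \
        "entity leak: check branching and dispose logic"
    # Units of measurement: utilisation must be a fraction, not a percent.
    assert 0.0 <= stats["utilisation"] <= 1.0, "check time units"
    # Initialisation: the simulation clock cannot run backwards.
    assert stats["end_time"] >= stats["start_time"], "clock error"

verify_run({"created": 500, "shipped": 470, "wip": 30,
            "utilisation": 0.82, "start_time": 0.0, "end_time": 480.0})
print("verification checks passed")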
3.1.2. Experimental frame development for ARENA models

It is important to have appropriate data to describe or represent the real system activities in order to drive the simulation model. In most simulation studies, determining what data to use is a very difficult and time-consuming process, especially in the design of a stochastic simulation model. Regardless of the method used to collect the data, the decision of how much to collect is a trade-off between cost and accuracy. Perera [20] has summarised and ranked a number of factors that affect the accuracy of analysis and the identification of the collected data, namely:
1. Poor data availability.
2. High-level model details.
3. Difficulty in identifying available data sources.
4. Complexity of the system under investigation.
5. Lack of clear objectives.
6. Limited facilities in simulation software or packages to organise and manipulate input data.
7. Wrong problem definitions.
In general, data about the system can be obtained from a number of sources [28, 31, 32]:
• Historical records
• Observation data
• Similar systems
• Operator estimates
• Vendors' claims
• Designer estimation
• Theoretical considerations
Stochastic systems contain one or more sources of randomness. Common sources of randomness for manufacturing systems are:
• Inter-arrival times of entities, such as parts, jobs or raw materials, to the system.
• Processing or assembly times required for an entity at various machines.
• Operation times for various processing machines.
• Repair or breakdown times for a given machine.
Therefore, the sources of input data for a manufacturing simulation model may include inter-arrival times, demand rates, loading and unloading times, processing times, failure times for different machines, repair times, etc., most of which are probabilistic. Three methods are used to process data from stochastic systems for random simulation models: we can sample directly from the empirical distribution; or, if the collected data fit a theoretical distribution, we can sample from the theoretical distribution; or we can choose a probability distribution based on theoretical considerations, prior knowledge or past research. If empirical data are to be used, they are input in the form of a cumulative probability distribution. Observed values are converted into an empirical cumulative distribution by arranging them in ascending order, grouping identical values, computing their relative frequencies, and then computing their cumulative probability distribution. The collected data can also be used to fit a theoretical distribution, which can then be selected as an input data generator for the simulation model. First, the collected data are summarised and analysed, either manually or using existing software packages; several excellent computer software packages are available to perform these functions and can simplify the task of selecting and evaluating a distribution. Figure 5 presents an example of the statistical procedure using the ARENA 'Input Analyser' facility to analyse and process external modelling (empirical) data, in the form of a histogram, to fit and form a standard distribution for model use. The right window shows the input data and the left window displays the shape of the histogram, which conforms to a normal distribution. The bottom window displays a summary report of the recommended distribution. The Input Analyser can be used to determine the quality of fit of probability distribution functions to the input data for the system's model. The collected data files that are loaded into and processed by the Input Analyser typically represent the time intervals associated with a random process, displayed as a histogram in the Input Analyser window.
80
Q. Wang, C. R. Chatwin, and R. C. D. Young
1-
Figure 5. Fitting empirical data as a sample distribution using 'Input Analyser'.
typically represent the time intervals associated with a random process in terms of a histogram in the Input Analyser Window. Once a specific distribution to fit the histogram is selected, it is always essential to assess the quality of its fit (i.e., to fit the best distribution to the data). This can be achieved by using formal statistical tests or by employing a simple graphical method in which an overlay of the theoretical distribution is displayed on a histogram of the data and a visual assessment is made to determine the quality of the fit [20, 30, 33]. ARENA contains a set of built-in functions and provides an interface (through various dialogue windows) to allow users specifying 'operands' for random variables to obtain samples from the commonly used probability distributions. Each of the distributions has one or more parameter values (mean, standard deviation, etc.) associated with it, which depends on the distribution of the random variables. Figure 6 illustrates
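The Input Analyser itself is a proprietary ARENA facility; as a rough illustration of the same fit-and-test procedure, the following Python sketch fits an exponential distribution to a sample and assesses the fit with a Kolmogorov-Smirnov test. The data array is a synthetic stand-in, not measurements from the system discussed here.

```python
# Sketch of the fitting-and-testing procedure performed by an input
# analyser, using scipy.stats. The data are a synthetic stand-in.
import numpy as np
from scipy import stats

data = np.random.default_rng(0).exponential(4.0, 200)  # stand-in sample, minutes

# Fit an exponential distribution; floc=0 pins the location at zero so
# only the mean (scale) is estimated from the data.
loc, scale = stats.expon.fit(data, floc=0)

# Kolmogorov-Smirnov test of the fitted distribution against the data:
# a large p-value means the fit cannot be rejected.
ks_stat, p_value = stats.kstest(data, "expon", args=(loc, scale))
print(f"fitted mean = {scale:.2f} min, KS = {ks_stat:.3f}, p = {p_value:.3f}")
```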
Figure 6. Random variables used for the PCBA system modelling.
Figure 6 illustrates the 'VARIABLES module', which is used to specify user-defined global variables and their initial values. The main idea of statistical inference is to take a random sample from a population (i.e., the entire group from which we may collect data) and then use the information from the sample to make inferences about particular population characteristics, such as the mean (a measure of central tendency), the standard deviation (a measure of spread) or the proportion of units in the population that have a certain characteristic. A sample is generally selected for study because the population is too large to study in its entirety; the sample should be representative of the general population, which is best achieved by random sampling. Because a sample examines only part of a population, the sample mean will not exactly equal the corresponding mean of the entire population. Thus, an important consideration for those planning and interpreting sampled results is the degree to which the sample produces an accurate estimate of reality. In practice, a confidence interval is used to express the uncertainty in a quantity being estimated. Inferences are based on a random sample of finite size from a population or process of interest; therefore, one gets different data (and thus different confidence intervals) each time [21, 28, 32, 33, 34]. The sampling distribution is the probability distribution or probability density function of the statistic; it describes the probabilities associated with a statistic when a random sample is drawn from a population. If a parameter in a system varies continuously, then it is possible that it conforms to one of the standard statistical probability distributions, such as the Uniform, Normal, Exponential or Poisson distributions, and its behaviour can then be sampled from a distribution. For instance, operation times at a workstation can be sampled from a distribution. First, the type of distribution must be determined, and its parameters must be
calculated. To do this, the actual operation times are studied and plotted as a frequency distribution. If the shape of the distribution suggests that it conforms to one of the standard distributions, then the 'goodness of fit' of the observed data can be assessed and the parameters for that distribution can be computed. If the frequency distribution of the actual times does not conform to a standard distribution, then the observed data can be expressed as a histogram and samples drawn from that, as sketched below; samples can also be drawn from a histogram giving the probability of an operation being performed at each workstation [31, 33, 35, 36]. The definitions of each of the distributions used for the ARENA models in this project are summarised, together with those used for the COMNET models, in section 3.2.1.2.
3.1.2.1. INPUT DATA ACQUISITION AND ANALYSIS FOR STOCHASTIC SYSTEM MODELS. The essence of this procedure is abstraction and simplification; the real difficulty in modelling is to determine which elements should be considered and included in the model [36, 37]. For establishing a flexible manufacturing system (FMS) model, these inputs could be abstracted by considering:
1. The basic configuration of the FMS and its production scheduling, which define the entities and activities involved in the model and the logic sequences that occur for each activity.
2. The number of workstations or machines that should be included in the simulation model.
3. How many types of processed parts need to move through the FMS, and whether or not they have similar processing requirements.
4. The buffer capacities for each machine.
5. Transport: conveyors or AGVs and their tracks.
6. The profile of operations allocated to each workstation or machine.
Once these elements, together with the logical functional relationships and their relevant descriptive information (descriptive variables), are determined, the simulation model can be built as a logical flow block (or pseudo-code) to describe and represent the real system to be investigated [30, 34, 35, 38, 39]. The authors believe that input data collection and analysis play a key role in the successful implementation of simulation model construction and simulation execution. Typically, more than one third of project time is spent on the identification, collection, validation and analysis of input data. Although very little research work has paid attention to the development of systematic approaches to input data gathering, a number of researchers have raised issues surrounding data collection [20]. Basically, the quality of the available data is a key factor in determining the level of detail and accuracy of the model. Stochastic models typically depend upon various uncertain parameters that must be estimated from existing data sets if available; otherwise, if the data does not exist, values can be sampled directly from theoretical probability distributions. With manufacturing systems, there is no standard method for collecting the required information [36].
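Where collected data does exist, the empirical route described above can be implemented directly. The following is a minimal sketch assuming plain numpy and a small stand-in data set: the observed values are arranged in ascending order, identical values are grouped, and samples are drawn by inverting the resulting cumulative distribution.

```python
# Sketch: forming an empirical cumulative distribution from observed
# values and sampling from it. The observed array is a stand-in.
import numpy as np

rng = np.random.default_rng(1)
observed = np.array([4.2, 3.1, 4.2, 5.0, 3.6, 4.8, 3.1, 4.4])  # stand-in data

# Arrange in ascending order, group identical values, compute relative
# frequencies and then the cumulative probabilities.
values, counts = np.unique(observed, return_counts=True)
cum_prob = np.cumsum(counts) / counts.sum()

def sample_empirical(n):
    """Draw n samples by inverting the empirical cumulative distribution."""
    u = rng.random(n)
    return values[np.searchsorted(cum_prob, u)]

print(sample_empirical(5))
```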
Data resources can possibly be collected from a literature survey, interviews with domain experts, industrial data reviews and state-of-the-art assessments. System design documentation includes data such as drawings, specifications, production records and so on; it is important that such data reflects the current configuration of the system. Although these resources are usually reasonably accurate, they may be inaccurate or insufficient, as historical records often do not represent the performance of the current system. Even though there is frequently copious data from reliable sources, simulation experts always argue over how the data should be used. If we sample directly from the empirical data, we may faithfully replicate the past, but no values other than those experienced in the past can occur. If we fit the data to a theoretical distribution and then sample from it, the simulation may give values either bigger or smaller than the historical data, so the accuracy with which the system is represented is in doubt. This debate still continues, and an appropriate solution is still unclear. If empirical data is to be used, it is input in the form of a cumulative probability distribution, which can be plotted by appropriate tools such as the so-called 'Input Analyser', which arranges data in ascending order, grouping identical values and computing their relative frequencies. To organise raw data, the collected data can first be summarised and grouped into classes or categories so that we can determine the number of individuals belonging to each class; the observed number is called the class frequency. We can then form frequency distributions by determining the largest and smallest numbers in the raw data, thereby defining the range, and breaking the range into a convenient number of equal class intervals. Next, we can determine the number of observations falling in each class interval to find the class frequencies; the frequency distribution can then be graphically plotted as a histogram, which represents a relative frequency distribution. Several excellent software packages, including ARENA, can perform these functions; they can simplify the manual tasks of selecting and evaluating a distribution for model input data [20, 28, 32, 40]. The most difficult case in simulation studies is when the data for modelling systems does not exist, either because the system does not exist or because it is not possible to obtain the data. Nevertheless, there are a number of possibilities for obtaining input data for system models: estimation or theoretical distributions. Vendors, designers and modellers can make the estimations; these depend greatly on the individuals involved, who have different experiences and use different measurement systems. Research has shown that people are very poor at estimating events even when they are very familiar with the systems concerned. Therefore, input data based on estimations may be highly unreliable, and in many cases it is hard to estimate at all. Instead, more popularly, we can choose a probability distribution based on theoretical considerations, i.e., using well-known statistical knowledge, so that we only need to determine how close this distribution is to reality by specifying the appropriate parameter values associated with the specific system [28, 30, 35].
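The trade-off in this debate can be made concrete: resampling the empirical data never produces a value above the historical maximum, whereas sampling from a fitted theoretical distribution can. The following sketch, using synthetic stand-in data, illustrates the difference.

```python
# Sketch illustrating the empirical-versus-fitted debate: empirical
# resampling replicates the past only, while a fitted distribution can
# produce values outside the observed range. Stand-in data only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
historical = rng.exponential(5.0, 100)        # stand-in for collected repair times

empirical_draws = rng.choice(historical, 10_000)       # replicate the past only
loc, scale = stats.expon.fit(historical, floc=0)
fitted_draws = stats.expon.rvs(loc=loc, scale=scale, size=10_000, random_state=rng)

print("historical max :", round(float(historical.max()), 2))
print("empirical max  :", round(float(empirical_draws.max()), 2))  # never above historical max
print("fitted max     :", round(float(fitted_draws.max()), 2))     # can exceed it
```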
One of the important skills of a simulation expert is to know how to summarise the data, to simplify the modelling process and to minimise the sensitivity of the results to errors in data estimates. Thanks to past studies of industrial engineering statistics, we already know many statistical distribution functions that can be used to
'represent' (or generate) various types of activity in industrial processes. For instance, it is well known among simulation experts that, for a random process, the inter-arrival times of customers (assembled parts) normally follow the exponential distribution, represented as EXPO(ParamSet), which is thus often used to model the random arrival times of events (and breakdown processes) but is generally inappropriate for modelling process delay times. The exponential distribution is also typically not a good choice for representing service times, as most service processes do not exhibit the high variability associated with it. The normal distribution is used for processing times when the mean is at least three standard deviations above zero. The uniform distribution is used when all values over a finite range are considered to be equally likely, and is generally used to represent 'worst case' results. Each distribution has one or more parameter values (mean, standard deviation, etc.) associated with it. However, the parameter values associated with the relevant distributions are also based on statistical estimations that often depend on the phenomena being represented. For example, the mean value of inter-arrival times can be estimated; if the times vary independently and randomly, and the estimated value is not large, then the time between arrivals can be modelled with an exponential distribution, and this estimation can be considered reasonable [28, 33, 40].
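A brief sketch of these selection rules, assuming numpy and purely illustrative parameter values, is given below; the guard on the normal distribution enforces the 'mean at least three standard deviations above zero' rule mentioned above.

```python
# Sketch of the distribution choices described above: exponential for
# inter-arrival times, normal for processing times only when the mean
# is at least three standard deviations above zero. Values illustrative.
import numpy as np

rng = np.random.default_rng(3)

def interarrival(mean_minutes, n):
    # Random, independent arrivals: exponential is the usual choice.
    return rng.exponential(mean_minutes, n)

def processing(mean, std, n):
    # Guard the 'mean >= 3 * std' rule so negative times are negligible.
    if mean < 3 * std:
        raise ValueError("normal distribution unsuitable: mean < 3*std")
    return rng.normal(mean, std, n)

print(interarrival(4.0, 3))
print(processing(6.0, 1.0, 3))
```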
3.1.3. Simulation data analysis

The simulation results include all the statistical summary reports, in the form of text-based tables or graphics, which show the system's performance measures pre-defined in the experiment file. In general, a standard simulation report is textual, presenting statistical data in the format of the sample mean, coefficient of variation, and minimum and maximum observed values, to represent factors such as the various times, queue lengths, machine utilisation, etc., within the investigated system. Simulation results can be used for designing new systems and/or for modifying and improving the operation of existing systems. One of the ultimate goals of the project is to compare and select the simulation-generated data in order to make inferences that improve the real system's performance. For instance, we may want to use the model to draw conclusions about the expected maximum time that a job spends at each processing station, so that we can find the system bottleneck, modify the system model to obtain a balanced system performance, and maximise the effectiveness and efficiency of the system. ARENA provides a facility called the 'Output Analyser'. Similar to the 'Input Analyser', output files generated by simulation models can be transferred, plotted, displayed and analysed in the 'Output Analyser' window. This can be useful for comparing the results of a simulation run with actual system data (by loading an external data file into the Output Analyser) for the purpose of validating a simulation model. ARENA also provides a facility that allows users to export output data files in one of two standard ASCII (American standard code for information interchange) file formats, using the 'Generate DIF' file and 'Export' options in the menus. The Generate DIF option converts the data in the specified data file to the DIF file format (a standard file format); since a variety of software packages use this DIF format, this
allows supplementary analysis or display of simulation results. The Export option reads unformatted data from an output data file and creates an ASCII-formatted file; this option is used when the results of a simulation need to be transferred to different types of computer operating systems or read into other software packages. Conversely, external data files can be imported into a data group window with the 'Load ASCII' file and 'Import' options. The Load ASCII file option reads a free-format ASCII data file (without an output data file header) and creates unformatted data for use in the Output Analyser. This interface is used to exchange information between the two simulation tools [30], ARENA 3.0 and COMNET III, the latter being used to simulate the information system's performance. In addition, during the simulation the ARENA animation function can be displayed in the ARENA window, so that the progress of the simulation can be observed and inspected by users.
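As noted in the discussion of statistical inference above, the uncertainty in a reported output statistic is usually expressed with a confidence interval computed across independent replications. The following sketch does this with scipy.stats; the replication means are hypothetical values, not results from this study.

```python
# Sketch: a 95% confidence interval for a simulation output statistic,
# computed across independent replications. Replication means are
# hypothetical stand-ins.
import numpy as np
from scipy import stats

rep_means = np.array([12.4, 11.8, 13.1, 12.0, 12.7, 11.5])  # e.g. mean job time per run

n = len(rep_means)
mean = rep_means.mean()
half_width = stats.t.ppf(0.975, n - 1) * rep_means.std(ddof=1) / np.sqrt(n)

print(f"95% CI for the mean: {mean:.2f} +/- {half_width:.2f} minutes")
```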
3.2. Communication system model development based on COMNET III

The COMNET III package [41] was developed from its former versions Network II.5, LNET and Simscript 2.5, and is written in MODSIM II, a high-level, object-oriented simulation programming language. COMNET III is a graphics-oriented simulation tool that can be used to analyse and predict the performance of existing networks, ranging from simple LANs to complex enterprise-wide systems, and to allow designers to evaluate alternative network designs by collecting simulation performance statistics. COMNET III supports a building-block approach in which the blocks are 'objects' that together constitute a model representing the real-world network. This network modelling approach allows a wide variety of network topologies and routing algorithms to be accommodated, including LAN, WAN, MAN and inter-networking systems; circuit, message and packet switching networks; connection-oriented and connectionless traffic; and adaptive and user-defined algorithms. The network's operation and protocol parameters are set through a series of IEEE tab dialogue boxes, which perform all functions of model design, model execution and presentation of results. COMNET III divides its simulation process into three phases [14, 40, 42, 43, 44, 45]:
• Network description and model construction.
• Network simulation.
• Simulation results and analysis.
3.2.1. Network description and model construction

This process can be split into two phases:
• Building a network architecture model.
• Building a network load profile for the resulting model network.
COMNET III's graphical user interface allows users to create and modify the network's topologies with various nodes and links, and to enter its operation and protocol parameter data through a series of IEEE tab dialogue boxes, which perform all functions of model design, model execution and presentation of simulation results.
Figure 7. COMNET III user-interface for network modelling.
As shown in figure 7, the COMNET III tool palette facility is used to create Nodes (communicating devices), Links (to which nodes may be connected, together with the protocols or rules for scheduling applications and routing traffic), Traffic Sources (the workload across the network) and other tools for editing. The COMNET main menu bar and its pull-down menus, which follow the standard format of Microsoft Windows and Microsoft NT, give users easy and quick access to the other functions of the COMNET window interface [41].
3.2.1.1. MODELLING OF NETWORK TOPOLOGIES. The first step in building a COMNET III simulation model is to construct a topology for the physical network to be investigated. This is because an automated manufacturing system is a large, complex on-line system, which further consists of several distributed systems; each distributed system involves various kinds of intelligent devices (robots, PCs, PLCs, etc.) on a computer network (including sub-networks). If a communication network is to support manufacturing applications, the network topology must be designed and determined so that the overall system is maintained on-line. As shown in figure 7, the physical layout of a network model for the PCBA communication system (the manufacturing network topologies concerned are shown in figure 8) can be built from three basic components: Nodes, Links and Arcs. Nodes represent hardware (computers or switches), Links carry traffic between nodes, and Arcs show a node's port connection to a link. In addition to these basic facilities, there are three objects with internal topologies: Subnet, Transit Net and WAN Cloud. The WAN Cloud is used for modelling WAN services, while the others are used for modelling independent routing domains and hierarchical topology.
" Nodes N odes in C O M N ET III models can be switches, hub s, network devices, end systems, pads and general network compo nents. C O M N ET III provides four basic types of nodes including N etwork Devi ce N ode; Processing N ode; Computer Gro up N ode; which generate or receive messages, and R outer and Switch N odes, which are on ly used for routing traffic. Processing Nodes model computer hosts as well as comm unication pro cessing devices. Each Processing Node has an internal processor that execu tes software and process packets. The Processing N od e that is represented in th e mo del support the following applications: an input buffer for each link transmitting packets to it; a processor to execute co mmands and proce ss packets; an output buffer for each link to which it can route packets; local disk storage capacity for mo delling local read/ write commands; a pending application list of cur rently scheduled applications; a received message list for saving received messages until th ey are used; a list of files that may reside in local disk sto rage.
"Links As shown in figur e 9, Links are used to model a variety of different transmission m edia, ranging from LAN s to wide- area point-to-point links.
Figure 8. Three network topologies for manufacturing (supported by ARENA 3.0).
COMNET III provides various types of links, corresponding to the types of medium access protocols available for users to select, including Aloha, CSMA, CSMA/CA, CSMA/CD, DAMA, Dial-up, FDM/FDMA, FDDI, Link Group, Modem Pool, Priority FDDI, Polling, Point-to-Point, Satellite (STK), TDM/TDMA, Token Passing, Virtual, and WAN. The CSMA/CD library provides parameter sets based on the IEEE 802.3 standard; the token-passing library provides parameter sets based on the IEEE 802.4 and 802.5 standards.
• Sub-networks, transit networks and WAN clouds
The Sub-networks (Subnets) in a COMNET III model are used primarily for modelling interconnected subnets with independent routing algorithms. A complex network may be built hierarchically using subnets, hiding detail from the upper view. Transit Nets can be considered as intermediate networks modelling the flow of packets through them, and can behave both as a subnet and as a link. WAN services are abstractly modelled in terms of Access Links and Virtual Circuits using the WAN Cloud object [41].
""
00
,,""*'" ...
IPLC 1·2'NIl
I'~
IroM I"""
c.ruo
I
OK
I
1'0<41 0.0)
.:J
I"",....... .:J
::.J ..J
I
x
~
I·......I......-....II C-I
I"",
StaCistict
::.J.-J
x
Figure 9. Building a network load profile for different types of networks.
r
_100<1
RecfNln q
~
::J .-J
3
@~~ ~_ I O_ I T'~ E,i7,....,..j~.......
N~
OK
.... ~,...
Fid ""ivai
lim....
I·..,....I....--Ie-·I
koo
I..~.·.".I"E-od-'0~~~--::J~..-J
AJrN5tmH lr eurdrt
Ls~l ~ .......lo---IT'" s_ .. .:J
IN..., lPl.Cl ·2 Mesg
PLC '·2 I
f
,r
U>od
Aloha CSMA CSMAlCO Poli'lg
DAM/<
PoilI·T<>PoilI
Modem Pool
OioUp Poi'lt·I<>Pori
lr>k G""",,
FOMA
TOM UN TOMA FOM
Vwtual WMllrlk
P,icri)l FO OI
Priotiy TokenRing
FOOl B",ic
CSMAJCl\
I~ Tok.en Pass:ng
PCBA network
u,."
ICSMA/Ct)
1etm4
::J
1
.::J
~
COr..
Ilft< 5_1 s.... I I_ancedI
Parametels
T,.,.
Ui< T"",
l'l""'IPCBA I"""
x
90
Q. Wang, C. R . Ch atwin, and R . C. D. Young
3.2.1.2. NETWORK TRAFFIC AND WORKLOAD. Figure 9 also illustrates the traffic sources in COMNET III, which include the 'Message Source', 'Session Source' (not shown in the figure) and 'Response Source'. The Message Source is the combination of an application source with a 'Transport Message Command' and is used for modelling specific user or protocol-control messages. The Response Source is the combination of an 'Application Source' with an 'Answer Message Command' and is used for modelling replies or acknowledgements to messages. The Session Source is the combination of an Application Source with a 'Set-up Command' and is used for modelling sessions of multiple messages, bursts of messages, or messages that are routed in virtual circuits. In addition to the sources mentioned above, the Call Source is the source to use for modelling circuit-switched calls; calls are specified by means of inter-arrival times, duration and bandwidth requirements. COMNET III allows external sources to be introduced into the COMNET model through an external traffic file, using the 'External' traffic menu. The external traffic file is a formatted text-based file containing a record for each traffic event: each record contains information about the time of the event, the source and destination, and other information that occurs in a real network. The traffic file may come directly from various network analysers or may be created by other tools; COMNET III can interpret events in the files as being messages, sessions or calls. The COMNET Baseliner utility is used to read in external traffic files and format them into an intermediate file for COMNET III use; this utility allows multiple traffic sources to be merged into a single intermediate file. This useful function has been applied to link the ARENA simulation results to COMNET, and the COMNET input data interface is illustrated further in the next section. The parameters of sources for network traffic and workload are added through the (call, message, response, session, packet flow or packet rate matrix) 'Source Dialogue Boxes' to drive the simulation. Network traffic refers to the messages sent between nodes in the network topology; the workload is the internal activity of a node's processors or buses. Application sources execute commands that introduce either traffic into the network or workload inside the node, whereas Message, Response, Session and Call sources simply generate traffic between nodes. Since nodes may have processing requirements for traffic moving between them, the workload commands can delay traffic by utilising the processor when the traffic needs to use it. Connectionless traffic, and the response as the receiving part of the connection, are modelled using Message Sources, while connection-oriented traffic is mainly modelled using Session Sources. Traffic sources can be scheduled in three ways: iteration time, received message text and trigger. The iteration time method allows sources to be scheduled according to an interval from the previous arrival, while the received message text method schedules sources depending on messages received at the node. Applications consist of several different commands specified within the nodes; they provide a flexible means of modelling both traffic generation and workload at a particular node. Some of these commands are Read, Write, Transport Message, Set-up Session, Answer Message, Process and Filter [41].
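As a concrete illustration of such a record-per-event file, the short Python sketch below writes a comma-separated traffic file. The actual field layout expected by COMNET III and Baseliner is not reproduced in this chapter, so the column names and device names used here are hypothetical.

```python
# Sketch: emitting a text-based external traffic file with one record
# per traffic event (time, source, destination, size). The exact field
# layout required by COMNET/Baseliner is not reproduced here; this
# comma-separated layout and the device names are hypothetical.
import csv

events = [
    (0.00, "Sensor1", "CellController", 64),
    (0.35, "CellController", "PLC1-2", 128),
    (0.41, "PLC1-2", "CellController", 64),
]

with open("traffic_events.txt", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["time_s", "source", "destination", "bytes"])
    for event in events:
        writer.writerow(event)
```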
In most cases, activities or events in a communication system, as in a manufacturing system in production, are stochastic. Furthermore, samples for the random variables within the system can be obtained from the commonly used probability distributions, such as the exponential and uniform distributions. ARENA 3.0 and COMNET III provide a set of built-in analytic distribution functions, which can be used to generate input data for models to drive the simulation engines. Figure 10 illustrates such a case; the distributions that have been used in the ARENA and COMNET models to represent the PCBA system are summarised below. More information related to engineering statistics can be found in references [28, 30, 33, 40, 41, 46].
• Exponential distribution-Exponential (Mean)
This distribution is widely used to model the arrival times of events that follow a Poisson pattern. Each sample chosen from the exponential function specifies the time that will elapse before the next arrival; this is called the inter-arrival time. Samples have a high probability of being less than the mean, which implies that the distribution has a long tail and will occasionally provide a sample significantly higher than the mean. This behaviour is very useful in modelling random arrival and breakdown processes, but it is generally inappropriate for modelling process delay times.
• Uniform distribution-Uniform (Min, Max)
All values between the minimum and maximum are equally probable, excluding the minimum and maximum values themselves. This distribution is used when all values over a finite range are considered to be equally likely, and is sometimes used when no information other than the range is available. Because of its large variance, this distribution can be used to model 'worst case' results, such as the message response time for fixed (or maximum) message sizes.
• Normal distribution-Normal (Mean, StdDev)
The normal distribution is often used empirically for many processes that are known to have a symmetric distribution and for which the mean and standard deviation can be estimated. The distribution should only be used for processing times when the mean is at least three standard deviations above zero. In COMNET III, the normal distribution is truncated so that it does not produce negative numbers. If the mean chosen is more than about three times the standard deviation there will be little effect, since only a very small portion of the normal distribution lies to the left of the origin. A message could be described, for example, as having a mean size of 20000 bytes and a standard deviation of 5000 bytes.
• Triangular distribution-Triangular (Min, Mode, Max)
The triangular distribution is commonly used in situations in which the exact form of the distribution is unknown, but estimates for the minimum, maximum and most likely values are available.
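The following sketch draws samples from each of these distributions with numpy, truncating the normal distribution at zero by resampling, in the spirit of COMNET III's truncation; all parameter values are illustrative.

```python
# Sketch: sampling the four distributions listed above, with the normal
# distribution truncated at zero by resampling. Parameters illustrative.
import numpy as np

rng = np.random.default_rng(4)

expo = rng.exponential(2.0, 1000)            # Exponential(Mean)
unif = rng.uniform(1.0, 5.0, 1000)           # Uniform(Min, Max)
tri = rng.triangular(0.2, 0.3, 0.5, 1000)    # Triangular(Min, Mode, Max)

def truncated_normal(mean, std, n):
    out = rng.normal(mean, std, n)
    neg = out < 0
    while neg.any():                          # redraw any negative samples
        out[neg] = rng.normal(mean, std, neg.sum())
        neg = out < 0
    return out

msg_sizes = truncated_normal(20000, 5000, 1000)  # message size in bytes
print(msg_sizes.min() >= 0)                      # truncation guarantees this
```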
Figure 10. Building a traffic load profile using standard probability distributions.
3.2.2. Network simulation
After a model has been built, COMNET III can check it automatically for correctness and completeness using the 'Verify' command. The 'Run Parameters' dialogue is then used to define a simulation experiment, including: the replication time for the duration of statistics collection; the warm-up period, during which statistics are not collected; the number of replications (and hence the number of reports); and two check-boxes, one for resetting the system to empty and idle at the end of each replication and one for running a warm-up for each replication. COMNET III can perform animation during the simulation by setting the animation parameters, though this significantly reduces the speed of the simulation.
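The following sketch shows, in plain Python, the replication and warm-up logic just described: observations recorded before the warm-up time are discarded before the replication mean is computed. The observation streams are synthetic stand-ins.

```python
# Sketch: warm-up deletion across replications. Observation streams
# are synthetic stand-ins for simulation output.
import numpy as np

WARM_UP = 30.0        # minutes; matches the warm-up used later in this chapter
RUN_LENGTH = 480.0    # 8 hours of simulated activity

def replication_mean(times, values, warm_up=WARM_UP):
    """Mean of the observations recorded after the warm-up period."""
    times, values = np.asarray(times), np.asarray(values)
    return values[times >= warm_up].mean()

rng = np.random.default_rng(5)
t = np.linspace(0.0, RUN_LENGTH, 961)
print(replication_mean(t, rng.exponential(4.0, t.size)))  # replication 1
print(replication_mean(t, rng.exponential(4.0, t.size)))  # replication 2
```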
3.3. Simulation result analysis

COMNET III provides two forms of reports, namely real-time and non-real-time reports. The former provides a graphical on-line representation of selected performance parameters of links and nodes during the simulation; figure 11 shows such an example. The latter provides all the various statistical results selected for the various objects or items (nodes, links, traffic sources, etc.), which can be viewed at the end of the simulation run for further study. The textual reports produced at the end of each replication of the model can be selectively turned on through the 'reporters' dialogue box. The major reports include message delay reports, response and session resource reports, and channel utilisation reports for links [41].

4. INTEGRATED MODEL APPROACH
Figure 12 illustrates the principle of the integrated model and the connection between ARENA 3.0 and COMNET III. The model is established using these two simulation tools; the output of one provides important input data to the other. Since ARENA and COMNET are two separate software packages, an interface has been developed that links the two packages together. This allows the statistical results (SDF files) generated by ARENA to be passed to the COMNET model.
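The interface itself is not listed in this chapter; the following Python sketch only illustrates the kind of hand-off involved, summarising ARENA-generated inter-event times into an SDF-style parameter record for a COMNET source. The function, field names and the choice of an exponential fit are assumptions made for illustration, not the authors' actual interface.

```python
# Sketch of the ARENA-to-COMNET hand-off: fit an exponential SDF to
# inter-event times and emit a parameter record. The record layout is
# hypothetical; the real SDF file format is not reproduced here.
import numpy as np
from scipy import stats

def arena_stats_to_sdf(samples, source, destination):
    """Fit an exponential SDF to ARENA-generated inter-event times."""
    loc, scale = stats.expon.fit(samples, floc=0)
    return {"source": source, "destination": destination,
            "distribution": "EXPO", "mean": round(float(scale), 4)}

samples = np.random.default_rng(7).exponential(2.5, 500)   # stand-in ARENA output
print(arena_stats_to_sdf(samples, "Sensor1", "CellController"))
```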
4.1. Establishment of an integrated model

As mentioned in section 1, this research attempts to investigate the performance of time-critical flexible manufacturing systems, in which all the communicating facilities or equipment are beneficially integrated through an efficient communication network. In order to allow an overall investigation, those factors which may potentially affect the system's behaviour and may cause fluctuation around the system's bottleneck must be modelled. Such fluctuation depends on complex factors, which may have a significant impact on the dynamics of the system; these should be identified and included in the model. For the PCBA system, the performance evaluation should be based on those factors (or performance measures) which influence the entire system. These factors stem not only from aspects of the PCBA operational system but also from aspects of the PCBA communication system. Therefore, these factors and their variables should be identified.
Figure 11. Network simulation on-line.

Figure 12. Flow chart for the integrated model based on ARENA 3.0 and COMNET III.
However, in this particular investigation, we do not include those resources (such as the design department for production scheduling, etc.) that are also linked and integrated into the system but operate in a relatively slow manner and thus do not significantly affect the performance of on-line production. Other assumptions for the established integrated PCBA model are summarised below:
• Production operates continuously and constantly, without breakdowns due to physical failures of the machines and devices on the assembly lines.
• The number of available machines or devices at the assembly lines, and their capacities to perform both the assembly and changeover activities, are fixed and known.
• Each workstation consists of one assembly robot (i.e., an SMT machine) with input and output buffers. There are no buffers between workstations. Each product is assembled at least once in the assembly schedule, which was determined previously.
• For the communication networks, it is assumed that all communicating devices (or machines) are able to communicate with each other properly, and that each device has a suitable interface that allows connection to the communication networks without physical failures.
• This study does not evaluate issues concerning production costs, such as the costs arising from assembly, labour, changeover and miscellaneous items.
Based on the above assumptions, some key requirements and steps for building an integrated model are summarised as follows.
1. Define all the relevant equipment (or workstations) in the real system to be modelled, including the number of machines, devices or workstations, the number of part types and the number of operations for each part, the capacity of each machine, the buffers and their capacities at each machine, and the machining sequence.
2. Analyse the operational functions of the system in order to form all the appropriate input data (i.e., the so-called experiment files) for the ARENA models, such as the physical and logical sequencing and timing parameters that represent the operational function of individual items of equipment. The timing parameters include loading/unloading times for each machine (and/or each loading/unloading station), material handling times for each assembly robot, machining times for each operation, arrival times for each type of part, and other parameters such as batch size, conveyor (and/or AGV) speeds and travelling distances. These data resources depend heavily on the particular system to be investigated.
3. Determine how to schedule the information processing function related to information events. The timing of information events is derived from the statistical analysis of the operational simulation output (statistical results) provided by the ARENA simulation models.
4. Analyse and extract the output statistical data from the ARENA simulations and interpret it into statistical distribution functions (SDFs), with the particular parameter values, to serve as the input variables required for the COMNET models. The SDF files for the COMNET models mainly concern the type of distribution used to represent the information flow activities between two communication devices via the communication network. Moreover, the most important consideration in choosing a specific probability distribution, with one or more parameter values, for a specific random communication behaviour is the degree of closeness with which it resembles the real information events. Obviously, the output statistical data from the ARENA simulation models that represent the random operational aspects (or processes) will certainly affect the probability distribution chosen for the COMNET modelling. The selection of the distribution's parameter values is one of the most critical procedures, because if it is not accurate, the simulation work using COMNET will not represent the real system's behaviour.
5. Ensure that the logical sequence and interaction of all components, and the interrelationship between the operational function and the information processing function in the system, are precisely defined, so that a complete simulation model can be implemented.
Once the generic model has been verified and validated, it can be run to represent the actual (physical and logical) operations of the real-world system without re-building the system model for different investigation scenarios, and it is also easy to add or remove components to investigate the effect of system alterations. The aim is to utilise the simulation-generated data to observe the impact on both systems' functions and to assess the entire system's performance by making inferences with the system model. From this, an optimal system specification can be drawn up.

4.1.1. Operational system
The process of assembling electronic components provides a typical flexible manufacturing system, which involves complex items (part types) being produced in limited quantities. For example, there may be short-term variations in the size, quantity and frequency of the lots presented to the system. This stochastic variation is termed 'flexibility' [25]. This type of assembly system is also a time-critical application.
• ARENA model of the PCBA system
A flexible manufacturing system from the printed circuit board assembly sector represents a stochastic manufacturing scenario extremely well. Figure 13 shows the layout of the PCBA system; its graphically animated simulation model was constructed using ARENA 3.0.

Figure 13. Layout of the PCBA system.

The system is composed of pallets, load/unload machines, pick and place machines, shifting-in devices, shifting-out devices, fixing devices, sensors, bar-code readers, stoppers, assembly robots (so-called SMT placement machines), cell controllers, carriers and flexible conveyor systems, which are routed and controlled by PLCs throughout the system. To summarise the operational sequence: unprocessed components (printed circuit boards) are held in pallets for transport into the 'Enter System' and are then loaded onto the loop conveyor by the loading machine at station M1. Assembled PCBs are unloaded at station M2, with a high priority given to exiting the PCBA system. Unfinished components enter another cycle until all assembly operations are complete. The sequence of operations at the stations is arbitrary. The entire operation is controlled by one of the cell controllers, which interacts with the others at the relevant workplace locations to accomplish the individual activities. The detailed system description and its parameters are explained below:
1. An arriving unprocessed palletised PCB enters the buffer area, where a sensor senses the arrival of each palletised PCB. The sensor sends a message to notify the cell controller and waits for a decision on access to a gravity slide, which feeds the loop conveyor containing the palletised PCBs queuing for assembly.
If access is not available, the sensor activates a stopper to halt the palletised PCB and prevent it moving onto the slide. Only two palletised PCBs are allowed on the slide at any one time, to avoid damage to the circuit boards. The time it takes to traverse the slide follows a normal distribution with a mean of 3 minutes and a standard deviation of 1 minute.
2. Once a palletised PCB reaches the end of the slide, it must wait for a space on the non-accumulating loop conveyor, which is controlled by a PLC. The loop conveyor, which has a length of 18 metres, has space for 30 circuit boards waiting for assembly. When an open space becomes available at the end of the slide, the cell controller informs the PLC to stop the loop conveyor, and the arriving palletised PCB is loaded onto the loop conveyor at station M1. This loading process requires an operation time that follows a triangular distribution with a minimum of 0.2, mode of 0.3 and maximum of 0.5 minutes; when the loading operation is completed, the loop conveyor is re-activated by the PLC.
3. The palletised PCBs then travel on the loop conveyor at a speed of 9 metres per minute until they reach their required assembly lines: the final assembly area for type 1 parts and the sub-assembly area for type 2 parts. Bar-code readers scan the bar-code labels to identify the status of each arriving PCB, and the PCB types are notified to the cell controller so that it can update their status. The queue in front of each assembly operation has room for two circuit boards. If the queue is full, the palletised PCB continues around the loop conveyor until it can enter the queue.
Figure 14. Schematic layout of the PCBA final assembly area.
If space is available in the queue, the palletised PCB is automatically diverted off the loop conveyor into the appropriate assembly system. The diversion of palletised PCBs from the loop conveyor does not cause the conveyor to stop.
4. The processing times at the sub-assembly area and the final assembly area conform to normal distributions with means of 6 minutes and 7 minutes, and standard deviations of 1 minute and 1.5 minutes, respectively. Once the assembly operation has finished, the part exits the assembly operation and enters an accumulating roller conveyor; each roller conveyor is 3 metres long, and parts travel at a speed of 8 metres per minute. However, if the accumulating roller conveyor is full, the assembled parts are not permitted to leave the assembly operation, thereby blocking it. The bar-code readers scan the finished parts at the end of the roller conveyors and send a message to the cell controller to update their status and to request transportation by one of the two available carriers. The carrier (AGV) moves to the end of the roller conveyor, picks up the processed parts, and leaves for its destination, which is selected according to the shortest-distance rule [28].
For an individual workstation WK (x) in the final assembly area shown in figure 14, when a palletised PCB comes in on belt 1, if the PCB is to be processed at station WK (x) and the buffer in front of WK (x) is not full, it shifts to belt 2x via the shift-in device SIn (x) to start the assembly operations by the robots, or queues in the buffer area. During assembly, the pallet is held stationary by the fixing device.
Other pallets behind it have to wait behind the shift-in device until the shifting process is completed. The pallet moves on belt 2x to SOut (x), where the palletised PCB is shifted back to the system conveyor, belt 1, with priority over those coming from the left on belt 1. After definition and validation of the system model, each simulation was run by simulating 8 hours of assembly activity with a 30-minute warm-up time; this took approximately 25-30 minutes on a 266 MHz PC. The results from the initial simulations were used to optimise the system model to give a continuous flow of PCBs. In addition, all the time-variable statistical information, such as the tallied frequencies of the time interval between the arrivals of two palletised PCBs at each workplace, was analysed as a reference resource to determine the statistical distribution functions related to communication events. This was used as input data to the COMNET model and represents the traffic resource required to handle the PCBA communication traffic.
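As a worked illustration of the timing parameters quoted above, the following sketch draws one sample of each operation time with numpy; it is a sampling sketch only, not the ARENA model itself.

```python
# Sketch: sampling the PCBA operation times with the parameters stated
# above (slide: Normal(3, 1) min; loading: Triangular(0.2, 0.3, 0.5)
# min; sub-assembly: Normal(6, 1) min; final assembly: Normal(7, 1.5)
# min). Illustrative only.
import numpy as np

rng = np.random.default_rng(8)

slide_time = rng.normal(3.0, 1.0)             # traverse the gravity slide
loading_time = rng.triangular(0.2, 0.3, 0.5)  # load onto loop conveyor at M1
sub_assembly = rng.normal(6.0, 1.0)           # sub-assembly area processing
final_assembly = rng.normal(7.0, 1.5)         # final assembly area processing

print(f"slide {slide_time:.2f}, load {loading_time:.2f}, "
      f"sub {sub_assembly:.2f}, final {final_assembly:.2f} (minutes)")
```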
4.1.2. Information processing system

Within the PCBA system, the communication network (LAN) must be able to provide a mechanism for communication and synchronisation among the several workstations (robots, etc.) working together to accomplish the assembly operations, without system failure due to network problems (such as overload, etc.).
• COMNET model of the PCBA system
Figure 15 shows the established COMNET model for the PCBA communication system. There are 93 communicating devices connected to a single local area network (LAN); all of these devices have a suitable interface that allows connection to the local area network without physical problems. The network communications link into the load/unload machines at stations M1 and M2, the shifting-out devices, shifting-in devices, fixing devices, conveyor systems, sensors, bar-code readers, stoppers, cell controllers, assembly robots, PLCs and a central control-level system (PC); their functions have been described in section 4.1.1. COMNET simulation models for large networks often divide the traffic into two types: foreground traffic and background traffic. Foreground traffic represents detailed models of applications and their protocols, and background traffic represents the existing utilisation that competes with the foreground traffic. Such models require a mechanism for modelling the background or baseline loading of the network. Often, this load is known only from a measurement of utilisation on the link, without any information as to the nature of that traffic. Modelling background traffic with message sources is often impractical, because the size of the network requires too many message sources to be configured, and the message sources themselves require the entry of many detailed attributes. It is very common that most of the above details of the background traffic are unknown, or that the only information known about the traffic is the utilisation present on the individual links in the network.
Figure 15. Model of the PCBA communication system built by COMNET III.
Therefore, the COMNET model requires statistical distribution functions (i.e., SDFs) as input variables, which characterise the actual information flow activities of the network. To implement a simulation, the traffic events among the system's devices should be scheduled individually by formulating a series of information flow charts, as shown in figure 16, to identify the information activities. These traffic events are further summarised statistically, using the matrix analysis method, to express the traffic events in terms of SDF files. The SDF files must then be formatted as so-called external traffic files, which are formatted text-based files containing a record for each traffic event; each record contains information about the timing variables of the event, the source and destination, and other relevant information [2, 36, 41]. COMNET III provides a tool called 'Baseliner' that has been used to read in the external traffic files and to format them into an intermediate file for use by the COMNET models. This utility allows multiple traffic sources to be merged into a single intermediate event file that contains the input data in the form of SDF files. Since the activity in the operational function is a random process, the activity in the information processing function is also a random process. Establishing a meaningful statistical distribution function corresponding to each group of information activities between two devices is a critical issue for the simulation, and determining which ARENA simulation statistics are valuable for formulating the COMNET simulation input is a difficult, time-consuming task. There are three methods of selecting appropriate probability distributions: the first is to use the actual data values, the second is to derive an empirical distribution, and the third is to use the best theoretical distribution. In the absence of data, the best way is to choose the most suitable theoretical distribution. Nevertheless, some of the required data does not exist, other data exists but with limited resolution, and there will always be controversy over which method of statistical analysis is suitable for processing the existing data: for example, should we use empirical data as input in the form of a cumulative distribution, or should we use historical data to fit a theoretical distribution and then sample from it? In this project, both approaches are used, since the accurate selection of input data plays a very important role in representing the real system in a precise manner. To ensure that the simulation is realistic, the 'Input Analyser', which is provided as a standard component of the ARENA environment, has been applied in order to determine and examine the quality of fit of the probability distribution functions to the input data for the COMNET model. All information flow activities between two communication devices have the following features:
1. The transmitter sends the message to a specified destination, i.e. the receiver.
2. The receiver receives the message from the transmitter.
3. The receiver sends a response message to the transmitter to confirm receipt.
4. The transmitter receives the confirmation message.
It is assumed that all communication devices except the cell controllers have sufficient memory to temporarily store data or processing programs to accomplish their specific tasks.
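A minimal sketch of this four-step exchange, modelling each step as an exponentially distributed delay, is given below; the per-hop delay parameter is illustrative and is not taken from the COMNET model.

```python
# Sketch of the four-step message/response handshake listed above,
# modelled as simple event times. The per-hop delay distribution and
# its mean are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(9)

def handshake(t_send, hop_delay_mean=0.002):
    """Return (send, receive, response, confirm) timestamps in seconds."""
    t_recv = t_send + rng.exponential(hop_delay_mean)     # message delivered
    t_resp = t_recv + rng.exponential(hop_delay_mean)     # receipt confirmed
    t_conf = t_resp + rng.exponential(hop_delay_mean)     # confirmation arrives
    return t_send, t_recv, t_resp, t_conf

print([round(t, 4) for t in handshake(0.0)])
```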
Figure 16. Information flow and activities on load/unload operations.
The cell controller has a 140 Mbyte built-in hard disk to store all the necessary data from the other devices, as well as the processing program files from the system-level computer. The system control-level computer communicates with only one of the two cell controllers, to send processing program data or to receive an hourly WIP status report collected by that cell controller, which also has the duty of distributing files to the individual workstations. The other cell controller communicates with any of the devices to give instructions or receive fault reports, etc., in order to control the system's operations. The SDF for this device is exponential.

5. SIMULATION RESULTS, ANALYSIS AND DISCUSSION
Extensive simulation results can be produced from the established integrated generic model once the simulations are completed. The selection of which simulation results will prove valuable and useful for analysis depends on the specific investigation required or on the clients' needs. The purpose of analysing and using the simulation results for this project is to assess the function of the operational and information systems, to ensure that there are no fundamental weaknesses in the PCBA system. This includes:
1. A full investigation of the capacity and equipment utilisation within the operational system and the information processing (communication) system that together constitute the entire PCBA system, in order to identify any bottleneck involved in the production processes, such as product flow, parts routing, resource assignment, assembly line balancing and network efficiency.
2. Beyond this, the key contribution of this research is to have developed an integrated method that allows system designers and analysts to determine the relevant impact on the logical interactions and interrelationships between the operations and information processing systems, based on an analysis of the various simulation results. This unique feature is particularly stressed and demonstrated throughout this chapter.
3. Furthermore, a comparison of the performance of alternative systems using different communication protocols for the PCBA communication system was also investigated, in order to maximise the system performance and to obtain an optimal solution.
It is impossible and unnecessary to display and analyse all the generated ARENA/COMNET simulation output data in this forum, not just because the data volume is vast but also because the data displayed always depends heavily on the end users' requirements for their specific investigation. Figure 17 presents an example of a text-based summarised simulation output (report) generated by the ARENA model of the PCBA system. Table 1 summarises the network performance measures and the related parameters that are investigated for end users in this study. For the operational aspect of a manufacturing system, the performance measures and their corresponding parameters are mainly presented in table 2.
(The report lists, for each machine, the occurrences of the states Starved, Busy, Failed and Blocked, with the number of occurrences, the average time, and the standard and restricted percentages for each state; the simulation run time was 7.55 minutes.)

Figure 17. An ARENA summary report after simulation.
Table 1. Performance measures and related parameters for end users' interests (1)

Performance variables: application functionality; user friendliness; response time; throughput; queue length/delay; network utilisation/bus utilisation; physical topology; transparency/protocol compliance; reliability/loss probability; capital cost.

Parameters: number of stations; message/packet/address sizes; station delay; network topology; redundancy; load characteristics/data rate, etc.; buffer size; protocol interface/channel access scheme; hardware delays; hardware/software, channel access, etc.
Table 2. Performance measures and related parameters for end users' interests (2); for each modelling element the table gives the modelling data (added/removed), the measurement (any bottleneck), and the typical corresponding ARENA template.

Parts. Modelling data: number of part types; arrival/leave time; batch size; max. batches; processing times. Measurement: output/queue. ARENA template: CREATE, ARRIVE/DEPART.

Machines. Modelling data: number of machine types; number of operations for each part; machine sequence; choice of machining process; machine time for each part; input/output buffer, etc. Measurement: work-in-process, utilisation. ARENA template: DELAY, SERVER/RESOURCE, SEIZE/RELEASE, CHOOSE/ROUTE.

Load/unload station. Modelling data: number of machines; capacity of each machine; loading/unloading time of each machine; input/output buffer capacity, etc. Measurement: delay/capacity, utilisation. ARENA template: SERVER, CHOOSE.

AGV. Modelling data: number of AGVs; speed/travelling distance between two stations; schedule (priority rule). Measurement: delay, utilisation. ARENA template: TRANSPORT, SEIZE/RELEASE.

Robot. Modelling data: number of robots; processing time. Measurement: utilisation. ARENA template: SERVER/SEIZE/RELEASE.

Conveyor. Modelling data: number of conveyors; speed/travelling distance. Measurement: capacity. ARENA template: CONVEYOR.

Workers. Modelling data: quantity. Measurement: utilisation. ARENA template: RESOURCE/SEIZE/RELEASE.

Buffers. Modelling data: types; capacity of each buffer. Measurement: delay/capacity. ARENA template: DELAY.
The following provides an analysis and discussion of the performance of the PCBA system based on its simulation results, with an emphasis on the effects of its communication system and, in particular, on how to use the integrated method to quantify the logical interactions and interrelationships between the operations and information processing systems of the entire system in order to obtain an optimal design solution.

5.1. Operational system's aspects
As shown in figure 13, the system is divided into two assembly areas (final and sub-assembly) and two load/unload stations. The final assembly area contains 5 workstations (assembly cells) and the sub-assembly area contains 3 workstations. As illustrated in section 4.1.1, the simulation model of the PCBA operational system was built using the ARENA tool, based on the techniques introduced in section 3.
Figure 18. An overview of ARENA on-line simulation results.
A series of pilot simulations have been utilised to capture and optimise the dynamic behaviour and the characteristics of the PCBA assembly process. An analysis and interpretation of the statistical simulation results serves as the primary basis for much of the client's decision-making. The main tasks in evaluating the operational system for this particular study involve:

5.1.1. Line balancing and collecting critical data
Line balancing is a major requirement of the ARENA investigation for the PCBA system throughout the simulation. The line-balancing activity attempts to arrange the individual processing and assembly tasks at the workstations so that the total time required at each workstation is approximately the same. In most practical situations it is very difficult to achieve a perfect balance, so the slowest station generally determines the overall production rate of the line. Figure 18 shows a performance snapshot of the optimised final assembly area during the course of the simulation. It shows the states of each workstation; their percentages of utilisation are 43.83, 47.79, 52.54, 47.74 and 43.73 respectively. These figures are quite close to each other, indicating that the throughput at each workstation is in balance; moreover, the entire optimised final assembly area also satisfies the production requirements. During the simulation, it was also observed that the system entered a steady state almost immediately after the simulation started, even though the time between arrivals is described by an exponential distribution. This can be explained by the fact that the loop conveyor serves as a buffer area that is good enough to absorb the variability caused by the exponential arrivals into the system.
Table 3. Minimum processing times in seconds for PCBs of type 1 and type 2

Station  Assembly area       Type 1  Type 2
WK1      Final               14      16
WK2      Final               25      18
WK3      Final               10      21
WK4      Final               23      13
WK5      Final               15      24
WK6      Sub-                12      23
WK7      Sub-                15      19
WK8      Sub-                28      16
M1       Loading/unloading   4       4
M2       Loading/unloading   4       4
However, as seen in figure 18, workstation 1 has a higher percentage of 'blocking'; it is thus identified as a bottleneck that would constrain the system performance. Typically, manufacturing utilisation ranges between 40 and 60 percent, and automated manufacturing systems can have an average device utilisation between 85 and 95 percent [1]. Table 3 shows a summarised report of the minimum processing times for completion of each process (in seconds), tallied for each type of PCB on each workstation. It is a valuable reference when compared with the maximum message delay obtained from the COMNET simulation results, which will be shown and discussed in section 5.2.
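The balance and bottleneck checks described above are easy to mechanise once the per-station percentages are extracted from the ARENA report. A minimal sketch, where the utilisation figures are those quoted for figure 18 but the blocking figures and thresholds are invented for illustration:

```python
# Line-balance and bottleneck check over per-station ARENA statistics.
# Utilisation values are those quoted for figure 18; the blocking values
# and the 10-point balance threshold are illustrative assumptions.

utilisation = {"WK1": 43.83, "WK2": 47.79, "WK3": 52.54,
               "WK4": 47.74, "WK5": 43.73}        # percent busy
blocking = {"WK1": 51.3, "WK2": 17.7, "WK3": 14.3,
            "WK4": 17.3, "WK5": 0.9}              # percent blocked (example)

spread = max(utilisation.values()) - min(utilisation.values())
print(f"utilisation spread {spread:.2f} points: "
      + ("line is balanced" if spread < 10 else "line is unbalanced"))

# The station with the highest blocked percentage is the bottleneck
# candidate, as observed for workstation 1 in figure 18.
bottleneck = max(blocking, key=blocking.get)
print(f"bottleneck candidate: {bottleneck} ({blocking[bottleneck]}% blocked)")
```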
5.1.2. Using animated simulation to investigate system performances

Some definitions:

• Starving of a machine or workstation (IDLE_RESource). If a machine or workstation cannot continue to operate because it has no parts to work on, the state of the machine or workstation is defined to be starved or idle.
• Blocking of a machine or workstation (BLOCKED_RESource). This occurs when a machine or workstation has completed its processing cycle but cannot transmit its part to the downstream machine or buffer. The state of the preceding, or upstream, machine or workstation is said to be blocked. In some circumstances, this is a dangerous situation during production.
• Failed machine or workstation (FAILED_RESource). A resource is in the failed state when a failure is currently acting on the resource.
• Busy machine or workstation (BUSY_RESource). A resource is in the busy state when it has one or more busy units.
Computer simulation, especially with animated graphics, can be very useful for assessing the performance of these complex production systems and for identifying their design flaws and operating problems. Simulation animation brings a simulation model to life by generating a moving picture of the model operation, so production problems are easily visualised; it also helps system designers determine the capacity of the flow line's storage buffers, etc.
For example, the ARENA animation system allows analysts to visually find any bottleneck at each machine on the computer screen and then to make modifications by re-setting system parameters in the models until the best performance results are obtained. Figure 19 shows a set of PCBA animated pictures (snapshots) at different stages of operations of the final assembly area during the same period of simulation. For instance, it can be clearly observed on the ARENA screen that a workstation can only release work if the next input buffer has free space available. If not, the work must be held; this phenomenon is known as 'blocking'. One of the factors that cause this problem is the inadequacy of the input buffer at the station. Simulation animation can provide a visual overview of this kind of bottleneck at each station. This helps system designers to modify the system's design, and hence its model, to gain improved performance. Other primary benefits of simulation animation include:

1. Verifying and validating the model. A successful process of model verification of simulation programmes does not ensure that the model appropriately represents the real system; it only ensures that the model is free of errors. Animation is the most effective way to tackle most problems (i.e., errors in logic) of model verification, and it decreases the likelihood of undetected errors. As discussed earlier, model validation is the process that determines whether a model is a sufficiently adequate approximation of the real system. Animation allows us to communicate model operation to clients who know the real system but have little knowledge of modelling.

2. Providing visual insights into dynamic interactions within the model, such as material flows and work-in-process levels, that are not easily obtained by examining statistical simulation outputs, and presenting instant on-line simulation results in terms of figures or histograms.

3. Furthermore, simulation animation can communicate the modelling analysis and results to manufacturing managers and convince them that the results are valid.

However, we normally cannot draw conclusions regarding system performance just from watching an animation of the system; therefore, using text-based simulation results is essential for the evaluation of manufacturing systems [1, 5, 14, 16, 25, 28, 29, 30, 32, 35, 39, 47, 48, 50, 51, 52, 53].

5.2. Information system's aspects
Since the PCBA system is an integrated system, any problem with any device will affect the operation of the entire system. For example, a LAN designed with a high message delay time will fail to deliver timely messages to networked devices and will fail to inform the cell controller to stop the entire system when a fault has occurred. In all such production scenarios, it is crucial to ensure that the maximum message delay is less than the shortest workstation (machine) processing time shown in table 3.
"
..... ... ::.
.. • •
•
aL
""
~
{ I
~
---. .
f
I
I
-, '-
i
"I
•
~ ~'I
i-
1
I
f
t
,! ~
,i
~
. .,
• .•Q
~ Q •• II
I•
I
I
Q
i
(i
I
~
;
I
d
~
.
I
.. • .Q
i
Q
,~ i
.1
I
110
~ ----
I
i
'I
"
.,
•
111
Figure 20. Channel utilisation (%) vs maximum message size (KB) for three different transmission rates (5BASE2, 10BASE2, 100BASET).
This guarantees each piece of equipment access to the network within its production cycle time, without breakdown. Applying the integrated model, which examines both functions, makes it possible to provide answers at the early design stage and to ensure that the maximum message delay does not cause production problems. In this study, simulations were executed repeatedly using the IEEE 802.3 CSMA/CD protocol and the IEEE 802.4 and IEEE 802.5 token passing protocols, setting different parameters to compare the performance of the communication system. The various generated simulation outcomes, including network capacity, message throughput, loss probability, message delay, and channel (or LAN) utilisation, can be used to investigate the network performance, depending on the user's requirements. For this project, the critical factors which affect the communication system's performance and have an impact on the operational and information processing systems were identified, extracted and displayed in graphical form; they are analysed and discussed below.

5.2.1. Channel utilisation (%)
The channel (also called LAN or network) utilisation is one of the most significant factors affecting network performance. In this investigation, the channel utilisation is the total usage time divided by the simulation run length, which represents the period of production. Figure 20 shows that the channel utilisation is affected by two factors that must be taken into account. One is the LAN transmission rate, which is provided by the communication protocol (CSMA/CD) with LAN transmission rates of 5 mbps, 10 mbps and 100 mbps. The other is the maximum message size sent between communicating devices. In this case study, the maximum message sizes were set in a range from 1 Kbyte to 125 Kbytes.
Figure 21. Maximum message delay (ms) vs transmission rate (mbps) for two different maximum message sizes (10 Kbytes and 50 Kbytes).
This corresponds to a channel utilisation increase from 1.43% to 2.42% when using IEEE 802.3 CSMA/CD 100BASET, from 14.19% to 23.92% when using IEEE 802.3 CSMA/CD 10BASE2, and from 28.39% to 48.34% when using IEEE 802.3 CSMA/CD 5BASE2. According to practical experience and published reports, the communication load (the amount of network traffic) on a LAN should typically be 5-10% of the maximum loading. Therefore, a LAN utilisation of more than 33% may be unacceptable for the control system of a manufacturing plant. There is often a misunderstanding among system designers who think that selecting high-speed processors to minimise data processing time must significantly reduce LAN traffic congestion. This research shows that there is no direct link between these factors. For a highly loaded communication system, high-speed devices lead to a very busy LAN, especially at peak times or at the moment when the whole system starts up. This reduces any advantage gained by using high-speed devices and can adversely affect the performance of the whole system. In reality, network designers simply increase the capacity of the network until it delivers a reasonable performance for the manufacturing system. Nevertheless, it is always a commercial objective to build a network with very good performance for minimum cost. The approach presented herein can provide network designers with a useful system overview at the design stage. It can also help designers obtain information on alternative solutions to meet capacity requirements and provide them with an estimate of network efficiency for the assumed conditions. This will reduce unnecessary investment in systems that have excessive capacity.
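The utilisation measure defined above, and the 33% rule of thumb, can be restated in a few lines. In the sketch below the message log is invented for illustration; a real study would take these figures from the COMNET report:

```python
# Channel utilisation = total channel usage time / simulation run length.
# The message log and run length below are invented for illustration.

def channel_utilisation(message_sizes_bits, rate_bps, run_length_s):
    usage_s = sum(size / rate_bps for size in message_sizes_bits)
    return 100.0 * usage_s / run_length_s

messages = [50_000 * 8] * 400                 # 400 messages of 50 Kbytes
for rate_mbps in (5, 10, 100):
    u = channel_utilisation(messages, rate_mbps * 1e6, run_length_s=600)
    verdict = "acceptable" if u <= 33 else "too high for plant control"
    print(f"{rate_mbps:>3} mbps: utilisation {u:5.2f}% ({verdict})")
```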
5.2.2. Maximum message delay (ms)

It can be seen from figure 21 that the maximum message delay increases rapidly as the LAN transmission rate decreases, for both maximum message sizes (10 Kbytes and 50 Kbytes). The maximum message delay is also affected by the maximum message size transmitted between the devices via the LAN.
Table 4. Collision-based protocols investigated for implementation of the PCBA network (protocol standard: IEEE 802.3 CSMA/CD); the table records the LAN utilisation (%) against maximum message size for each variant: 1BASE5 STAR, 3BASE2, 5BASE2, 10BASE2, 20BASE2, 50BASE2, 100BASET and Gigabit.

† Results in first 1200 seconds of simulation time. †† Results in first 600 seconds of simulation time.
It is interesting to see that, for both maximum message sizes, the maximum message delay is relatively small when the transmission rate is set to more than 10 mbps, and relatively large when it is set to less than 3 mbps. It is also observed that when the transmission rate is set at 1 mbps, the LAN has a very high channel utilisation of 92.96% and 93.71% (see table 4) for the two message sizes in the first 1200 seconds of simulation time; the simulation collapsed after 1200 seconds. This indicates that a LAN with a transmission rate of 1 mbps or less is incapable of handling the required communication load of the PCBA communication system. Furthermore, using the integrated simulation model of the PCBA system enables the designer to compare the maximum message delay obtained from the COMNET simulation results with the minimum tallied machine processing time. Figure 21 shows that the maximum message delay is 1850 ms for a maximum message size of 10 KB, and 3776 ms for a maximum message size of 50 KB (corresponding to the LAN at 3 mbps). By inspection of the minimum machine processing times shown in table 3, it can be seen that a LAN with a transmission rate of more than 3 mbps, for both maximum message sizes, guarantees that the maximum message delay will be less than the shortest workstation processing time. This ensures that all facilities can access the network in time during the PCB assembly process. The simulation results show that, for the two maximum message sizes, a LAN with a transmission rate of 5 mbps has maximum message delays of 396 ms and 1499 ms, and maximum message delays of 51 ms and 233 ms when the transmission rate is 10 mbps; these are very small delays. Therefore, a LAN with a transmission rate ranging from 3 mbps to 10 mbps would certainly guarantee operation of the manufacturing communication system without failure.
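The design rule applied in this section (the maximum message delay must stay below the shortest processing time in table 3, here 4 s) can be checked mechanically against the simulated delays. A sketch using the delay figures quoted above; the data layout is an assumption for illustration:

```python
# Feasibility rule: maximum message delay < shortest workstation
# processing time (4 s for the load/unload stations in table 3).

SHORTEST_PROCESSING_MS = 4 * 1000

# Maximum message delays (ms) quoted in this section for maximum
# message sizes of 10 KB and 50 KB.
max_delay_ms = {
    3:  {"10KB": 1850, "50KB": 3776},
    5:  {"10KB": 396,  "50KB": 1499},
    10: {"10KB": 51,   "50KB": 233},
}

for rate, delays in sorted(max_delay_ms.items()):
    worst = max(delays.values())
    ok = worst < SHORTEST_PROCESSING_MS
    print(f"{rate:>2} mbps: worst delay {worst} ms -> "
          + ("feasible" if ok else "infeasible"))
```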
5.2.3. Comparative dynamic performance of LANs for the PCBA system

In this study, simulations were executed repeatedly using the IEEE 802.3 CSMA/CD protocol and the IEEE 802.4 and IEEE 802.5 token passing protocols, setting different parameters to compare the performance of the communication system against the user's requirements.
Figure 22. Channel utilisation (%) vs transmission rate (mbps) for token passing and CSMA/CD LANs (maximum message size: 50 Kbytes).

Figure 23. Maximum message delay (ms) vs transmission rate (mbps) for token passing and CSMA/CD LANs.
The various simulation outcomes include message throughput, loss probability, message delay, and channel utilisation, etc. The factors which significantly affect the system's performance are displayed and analysed in graphical form below.

5.2.3.1. CHANNEL UTILISATION (%) AND MAXIMUM MESSAGE DELAY (MS) VS TRANSMISSION RATES (MBPS). Figure 22 and figure 23 indicate the variations of channel utilisation and maximum message delay against transmission rate for both token passing bus and CSMA/CD LANs. The results were obtained by setting a maximum message size of 50 Kbytes across the network. They indicate that the channel utilisation and the maximum message delay increase rapidly as the transmission rate decreases below 10 mbps. This is a particular problem in the case of the maximum message delay of the CSMA/CD LAN. Hence, the effect of transmission rates must be taken into account for both LAN protocols. It is interesting to observe that, for both LAN protocols, the values of channel utilisation corresponding to the same transmission rate between 2 mbps and 20 mbps are approximately the same. In contrast, for the two LAN protocols, the values of the maximum message delay corresponding to the same transmission rate (less than 10 mbps) are significantly different. For the example shown in figure 23, at a transmission rate of 3 mbps, which corresponds to a channel utilisation of nearly 60% for the token bus LAN and 61% for the CSMA/CD LAN, the maximum message delay is 927 ms and 3776 ms respectively. This indicates that at the same network load, especially for a heavily loaded network, the performance of the token bus LAN is much better than that of the CSMA/CD LAN. For a network load of less than 18% (i.e., a transmission rate of more than 10 mbps) there is no significant difference in performance between the two types of network. From figure 23, it can be seen that at transmission rates of more than 10 mbps, the corresponding maximum message delays for the two LANs are very close and relatively small. However, for a transmission rate of less than 10 mbps (i.e., a network load of over 18%), there is a large difference in maximum message delay between the two LAN protocols. For transmission rates lower than 2 mbps, the difference in maximum message delay between the two LANs increases sharply. When the transmission rate is 1 mbps, the channel utilisation of the token bus reaches 100% in 2675 seconds of simulation time, whereas the channel utilisation of the CSMA/CD LAN reached 93.71% in about 1200 seconds of simulation time (the simulations collapsed after 2675 and 1200 seconds respectively). The corresponding maximum message delays for the two LANs are very different: 1081 seconds and 663 seconds respectively. Both are much higher than the shortest machine processing times shown in table 3. This indicates that, for both LAN protocols, a minimum transmission rate of 2 mbps is essential to successfully operate the PCBA communication system. A cross comparison also shows that at high load the performance of the token bus LAN is better than that of the CSMA/CD LAN. This is further illustrated by figure 24, which combines figures 22 and 23 and illustrates the relationship between network load and maximum message delay. It can be seen that at a channel utilisation of over 16% the maximum message delay starts to increase sharply for the CSMA/CD LAN compared to the token bus LAN. This also confirms previous studies showing that, under certain circumstances (for instance, at higher network load with a lower network transmission rate), token passing can be more efficient than CSMA/CD for the PCBA system LAN.

5.2.3.2. CHANNEL UTILISATION (%) AND MAXIMUM MESSAGE DELAY (MS) VS MAXIMUM MESSAGE SIZES (KB). The maximum message size is an important factor affecting the performance of the networks. For both LAN protocols, it can be seen from figure 25 that the channel utilisation increases rapidly as the maximum message size increases from 1 Kbyte to 125 Kbytes. Figure 26 shows a comparison of the non-linear variation of maximum message delay against the maximum message size for both LANs.
Figure 24. Maximum message delay (ms) vs channel utilisation (%) for token passing and CSMA/CD LANs (maximum message size: 50 Kbytes).
Figure 25. Channel utilisation (%) vs maximum message size (KB) for token passing and CSMA/CD LANs (transmission rate: 5 mbps).
It is observed that the maximum message delay is greatly affected by the maximum message size transmitted between network devices, and that the effect is significantly different for the different LAN protocols. As shown in figures 25 and 26, although a higher maximum message size results in a higher message delay for both LAN protocols, the gap between the corresponding maximum message delays of the two protocols widens significantly as the maximum message size in the network increases.
Figure 26. Maximum message delay (ms) vs maximum message size (KB) for token passing and CSMA/CD LANs (transmission rate: 5 mbps).
For instance, for a maximum message size of 50 Kbytes, the maximum message delay for the token bus LAN is 241 ms with a channel utilisation of 35.89%, compared to a maximum message delay of 1616 ms for the CSMA/CD LAN with a channel utilisation of 36.36%. This is further evidence that at the same network load, the token bus LAN has superior performance to the CSMA/CD LAN for the PCBA communication system; the benefit becomes significant when the LAN is heavily loaded. Furthermore, applying the integrated simulation model of the PCBA system enables designers to compare the maximum message delay obtained from the COMNET simulation results with the minimum tallied machine processing time. From figures 22 and 23, and based on the COMNET text-based simulation reports, at the same maximum message size of 50 Kbytes the maximum message delay is 4748 ms (89.28% busy) and 927 ms (59.67% busy) for the token bus LAN at 2 mbps (not shown in figure 23) and 3 mbps respectively, and 42701 ms (91.04% busy) and 3776 ms (61.16% busy) for the CSMA/CD LAN at 2 mbps (not shown in figure 23) and 3 mbps respectively. By inspection of the minimum machine processing times shown in table 3, it can be seen that a LAN with a transmission rate of over 3 mbps, for both the token bus LAN and the CSMA/CD LAN, will guarantee that the maximum message delay is less than the shortest workstation (machine) processing time. This ensures that all facilities have sufficient time to access the network during the PCB assembly process. Moreover, an analysis based on the simulation results concludes that a CSMA/CD LAN with a transmission rate between 5 mbps and 10 mbps has a maximum message delay from 1499 ms down to 233 ms, corresponding to channel utilisations of 36% and 18% respectively; these delays are relatively small, hence performance is reasonable. For a token bus LAN, a transmission rate between 3 mbps and 5 mbps is fast enough to undertake the communication duties, and a transmission rate of more than 10 mbps leads to a very small maximum message delay for both LANs. Within this range, the simulation results show no data lost during transmission across the network.
Table 5. Token passing-based protocols investigated for implementation of the PCBA network (protocol standards: IEEE 802.4 and IEEE 802.5; maximum message size: 50 KB; access method: token)

Data rate (mbps)   Token bus: LAN (%) / Max. delay (ms)   Token ring: LAN (%) / Max. delay (ms)
1                  100.00† / -                            100.00†† / -
3                  59.67 / 927.36                         59.57 / 724.66
5                  35.89 / 240.47                         35.99 / 293.12
10                 17.99 / 112.12                         17.97 / 110.39
15                 11.97 / 51.53                          11.89 / 61.12
20                 8.94 / 39.77                           8.98 / 44.41

† Results in first 2675 seconds of simulation time. †† Results in first 2160 seconds of simulation time.
Therefore, it was finally suggested that either the PCBA CSMA/CD LAN or the PCBA token passing bus LAN, with any transmission rate ranging from 5 mbps to 10 mbps, would certainly guarantee the failure-free operation of the PCBA manufacturing communication system (this is for a maximum message size of 50 Kbytes in the PCBA network). Since the operation of the token bus and token ring is similar [45], the simulation results for the token ring LAN shown in table 5 are extremely close to the results for the token bus; hence, the discussion relating to the token bus also applies to the token ring, although the token ring is not physically suitable for the PCBA communication system.
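The cross comparison of the two protocols at comparable load reduces to pairing the reported (utilisation, delay) figures. A short sketch over the values quoted in section 5.2.3 for a 50 Kbyte maximum message size; the dictionary layout is an assumption for illustration:

```python
# Reported (channel utilisation %, maximum message delay ms) pairs for a
# 50 Kbyte maximum message size, taken from the text of section 5.2.3.
token_bus = {3: (59.67, 927), 5: (35.89, 241)}
csma_cd = {3: (61.16, 3776), 5: (36.36, 1616)}

for rate in sorted(token_bus):
    (u_tb, d_tb), (u_cs, d_cs) = token_bus[rate], csma_cd[rate]
    print(f"{rate} mbps: token bus {d_tb} ms @ {u_tb}%  vs  "
          f"CSMA/CD {d_cs} ms @ {u_cs}%  "
          f"(CSMA/CD delay {d_cs / d_tb:.1f}x higher at similar load)")
```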
6. DISCUSSION AND CONCLUSION

As outlined and discussed in section 1, for increasingly highly automated, computer-controlled manufacturing systems, the successful integration of manufacturing devices and automated equipment using existing communication protocols and networks is crucial to achieve the desired, cost-effective, co-ordinated functionality required for CIM systems. As a result, the performance of communication networks has become a key factor for the successful implementation of integrated manufacturing systems, particularly for time-critical applications. Hence, the design and evaluation of manufacturing systems can no longer ignore the performance of the communication environment or conduct a separate investigation without considering the performance of the operational system. Section 5 presented an assessment of the operational aspects of the PCBA system to ensure that the system has no fatal bottlenecks or weaknesses in its operations. It addressed the issues presented in sections 5.2.1 and 5.2.2, which discussed the impact on the logical interactions and interrelationships between the operations and information processing systems within the PCBA environment, and determined the relative performance merits of the three IEEE 802 standard networks, of which the token bus LAN performs best when implemented for the PCBA communication system. The outcome also shows that the token bus is better suited to process control applications (since they are time-critical applications) than the CSMA/CD protocol network, which is well suited to standard computer network applications, where the network loading rarely exceeds 8-17%.
A comprehensive review of the current literature reveals the lack of a feasible and practical modelling and simulation method with the ability to investigate manufacturing systems by taking both aspects into account. In fact, there is no single conceptual modelling method or tool available that can completely model a manufacturing system and easily describe most of its sub-systems, owing to the high level of complexity of manufacturing systems. It is generally accepted that traditional planning methods and mathematical/analytical modelling techniques are not appropriate for dealing with complex manufacturing systems. Nevertheless, manufacturing systems analysts, designers and their clients have an increasingly important requirement for a 'full' system evaluation (particularly for the investigation of highly integrated, time-critical manufacturing systems), which models the basic manufacturing operations and incorporates the effect of the communication systems. Therefore, the aim of the research reported herein was to focus on:

The development of an integrated method, in which both the operations and information systems within a manufacturing system can be examined concurrently using currently available simulation tools and techniques, so that the relevant impact on the logical interactions and interrelationships between them can be determined. Moreover, this technique should be implemented on a real system to test the feasibility of this approach becoming a strategic planning tool for systems analysts and designers, quickly providing a visible preview of the integrated system performance at an early stage in the design process.
The major contribution of this treatise is a methodology developed to examine a manufacturing system by modelling and simulating its integrated operational and information systems. This approach has been implemented on a relatively complex flexible manufacturing system, a printed circuit board assembly (PCBA) line, in order to determine its feasibility and capability. The key features of this technique have been demonstrated by analysing and comparing the various simulation results (in terms of graphs and tables) generated by the established integrated model of the PCBA system using the two powerful simulation packages that were specially selected for use in this integrated domain. The research has shown that applying this integrated method allows system designers and analysts to comprehensively predict system behaviour in order to obtain an optimal solution that maximises system performance. The integrated model allows users to see the impact on the logical interactions and interrelationships between the operations and information processing systems within a manufacturing environment so that they can make design judgements that satisfy the system and production requirements. From this, an optimal system specification can be drawn up. The research has shown that this approach contributes a useful basis for developing existing modelling frameworks and a practical means of exploiting existing modelling and simulation methodologies. The research indicates that, in principle, this technique is valuable for analysing a wide range of manufacturing systems (CIM systems, FMSs, process control systems, etc.). Finally, the concept of economic performance control came into being during the 1970s petroleum crisis, when industrial circles realised that process control systems
that excluded economic variables were not guaranteed to benefit enterprise economic planning. To avoid this difficulty, economic variables must be selected as the ultimate control variables of the control system, and specific cost and market information must be taken as the input that disturbs the control system. However, some of the economic variables are not measurable on-line; therefore, model prediction may be used to generate data for them, but model reliability and system stability are then difficult problems. It is wise at present to develop process economic performance display (rather than control) software for industry. This will yield manufacturing profitability with lower economic risk. For example, the economic variable to be displayed for a chemical plant may be the instantaneous profit IP:

IP = SP - PC
where SP is the selling price and PC is the production costs; most components of PC are measurable on-line. It is widely accepted that, in general, the economic performance of an enterprise is a function of 8 Ms:

1. Man (personnel and manpower)
2. Machine (equipment)
3. Material (including energy)
4. Money (floating capital)
5. Market
6. Method
7. Moment (time)
8. Message (information)

Obviously, the objective of the enterprise is to maximise profit. Therefore, a good system model should optimise the response to the above variables. A first step is to make available not only technical data, but also instantaneous information on the economic performance of the enterprise concerned, without which decision making is often misguided.
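A display of instantaneous profit needs only the on-line measurable components of PC, plus model-predicted values for the rest. A minimal sketch, in which the cost breakdown and all numbers are invented for illustration:

```python
# Instantaneous profit display: IP = SP - PC. Component names and numbers
# are illustrative; most PC components are measurable on-line, the rest
# must be model-predicted.

def instantaneous_profit(selling_price, production_costs):
    return selling_price - sum(production_costs.values())

costs_per_unit = {
    "material": 41.0,            # measurable on-line
    "energy": 6.5,               # measurable on-line
    "labour": 12.0,              # measurable on-line
    "overheads_estimate": 9.0,   # not measurable on-line: model-predicted
}
ip = instantaneous_profit(selling_price=80.0, production_costs=costs_per_unit)
print(f"instantaneous profit IP = {ip:.2f} per unit")
```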
P.S. This work may match the following subject areas:

• New computer technology for enhanced factory modelling and visualisation
• Integration of design with manufacturing planning
• Process modelling in an integrated design and manufacturing environment
• Optimisation techniques for factory design
• Advances in discrete event simulation
• Enterprise resource planning

Keywords:
Manufacturing systems, computer networks, modelling and simulation, integration, FMS, CIM.
REFERENCES
[1] Groover M. P., 2000. Automation, production systems, and computer integrated manufacturing (Prentice-Hall, Inc.).
[2] Wong W. M. R., 1993. Modelling and simulation of the communication protocols used in typical CIM equipment. Bradford University.
[3] Mansharamani R., 1997. An overview of discrete event simulation methodologies and implementation. Sadhana, Vol. 22, Part 5, 611-627.
[4] McCarthy I., Frizelle G., Efstathiou J., 1998. Manufacturing complexity network meeting, University of Oxford. EPSRC Engineering and Physical Sciences Research Council.
[5] Chou Y. C., 1999. Configuration design of complex integrated manufacturing systems. International Journal of Advanced Manufacturing Technology, 15:907-913.
[6] Al-Ahmari A. M. A., Ridgway K., 1999. An integrated modelling method to support manufacturing system analysis and design. Computers in Industry, 38 (1999), 225-238.
[7] O'Kane J. F., Spenceley J. R., Taylor R., 2000. Simulation as an essential tool for advanced manufacturing technology problems. Journal of Materials Processing Technology, 107 (2000), 412-424.
[8] Kim C. H., Weston R., 2001. Development of an integrated methodology for enterprise engineering. International Journal of Computer Integrated Manufacturing, 14 (5), 473-488.
[9] Balduzzi F., Giua A., Seatzu C., 2001. Modelling and simulation of manufacturing systems with first-order hybrid Petri nets. International Journal of Production Research, 39 (2), 255-282.
[10] Cunha P. F., Dionisio J., 2002. An architecture to support the manufacturing system design and planning. Proceedings of the 1st CIRP(UK) Seminar on Digital Enterprise Technology, Durham, UK, 129-134.
[11] Bernard A., Perry N., 2002. Fundamental concepts of product/technology/process informational integration for process modelling and process planning. Proceedings of the 1st CIRP(UK) Seminar on Digital Enterprise Technology, Durham, UK, 237-240.
[12] Cantamessa M., Fichera S., 2002. Process and production planning in manufacturing enterprise networks. Proceedings of the 1st CIRP(UK) Seminar on Digital Enterprise Technology, Durham, UK, 187-190.
[13] Higginbottom G. N., 1998. Performance evaluation of communication networks (Norwood: Artech House, Inc.).
[14] Mitchell F. H., 1991. CIM systems (Prentice-Hall Ltd.).
[15] Colquhoun G., Baines R., Crossley R., 1993. A state of the art review of IDEF0. International Journal of Computer Integrated Manufacturing, 6 (1993), 252-264.
[16] Doumeingts G., Vallespir B., 1995. Methodologies for designing CIM systems: a survey. Computers in Industry, 25 (1995), 263-280.
[17] Al-Ahmari A. M. A., Ridgway K., 1997. Computerised methodologies for modelling computer integrated manufacturing systems. Proceedings of the 32nd International MATADOR Conference, Manchester, 111-116.
[18] Chryssolouris G., Anifantis N., Karagianis S., 1998. An approach to the dynamic modelling of manufacturing systems. International Journal of Production Research, 38 (90), 475-483.
[19] Baines T. S., Harrison D. K., 1999. An opportunity for system dynamics in manufacturing system modelling. Production Planning & Control, 10 (6), 542-552.
[20] Perera T., Liyanage K., 2000. Methodology for rapid identification and collection of input data in the simulation of manufacturing systems. Simulation Practice and Theory, 646-656.
[21] Borenstein D., 2000. Implementation of an object-oriented tool for the simulation of manufacturing systems and its application to study the effects of flexibility. International Journal of Production Research, 38 (9), 2125-2142.
[22] Wang Q., Chatwin C. R. et al., 2002. Modelling and simulation of integrated operations and information systems in manufacturing ('A' rating awarded), The International Journal of Advanced Manufacturing Technology, Vol. 19, pp. 142-150.
[23] Wang Q., Chatwin C. R. et al. Comparative dynamic performance of token passing and CSMA/CD LANs for a flexible manufacturing system, The International Journal of Computer Integrated Manufacturing, in press.
[24] Wang Q., Geha A., Chatwin C. R. et al., 2002. Computer enhanced network design for time-critical integrated manufacturing plants, Proceedings of the 1st CIRP(UK) International Seminar on Digital Enterprise Technology (DET02), Durham, UK, pp. 251-254.
[25] Gastaldi M., Levialdi N., 1996. Dynamic analysis of the perf...
TECHNIQUES AND ANALYSES OF SEQUENTIAL AND CONCURRENT PRODUCT DEVELOPMENT PROCESSES
MARKO STARBEK, JANEZ GRUM, ALES BREZOVAR, AND JANEZ KUSAR
1. INTRODUCTION
A company can enter the global market only if it can fulfil customer needs regarding the features and quality of its products. Customers are becoming more and more demanding and their requirements are changing all the time. "The customer is king!" is becoming the motto of today. In these circumstances, only a company that can offer its customers the right products in terms of features and quality, products which are produced at the right time and place, at the right quality and at the right price, can survive on the global market. A product which is not manufactured in accordance with the needs and requirements of the customers, which hits the market too late or is too expensive, will not survive. When developing a new product, the company has to pay special attention to the fulfilment of the basic market requirement, i.e. as short a new product development time as possible (as short a delivery time as possible). Fierce market competition increases the pressure on companies to hit the market with new products sooner than their competitors. This goal can only be achieved by a reduction of product development time, while the quality and cost of the product are taken into account at the same time; this is possible if the concurrent engineering concept is used. The basic idea of concurrent engineering is the concurrent execution of formerly sequential activities during the new product development process. By executing activities concurrently it is possible to harmonise decisions during the draft phase, which prevents delays and engineering changes during the manufacturing of the product.
Figure 1. Product development process as a part of the product life cycle (stages include: marketing, product planning and tenders; making design documentation; planning of manufacturing and assembly and development of production; material management; QC; assembly; start of operation; use; removal and recycling).
The motto for successful implementation of the concurrent engineering concept says: "Concurrent engineering starts in the heads of team members." Several authors [1], [2], [3] have analysed the activities in individual stages of new product development processes, and concluded that the volume and contents of product development activities depend on the quantity and purpose of the product. There is a substantial difference between new product development activities in individual and mass production [4]. This chapter presents techniques and analyses of sequential and concurrent product development processes, the emphasis being on team work, organisational structures and the tools needed for the transition from a sequential to a concurrent product development process. The chapter also presents the results of the implementation of concurrent engineering in an SME which produces civil engineering equipment.

2. SEQUENTIAL ENGINEERING
2.1. Sequential product development process
The main feature of sequential engineering is the sequential execution of the stages in the product development process. Figure 1 presents the sequential product development process as a part of the product life cycle. The next process stage can begin only after its preceding stage has been completed. Data on the current process stage are collected gradually and are complete when the stage is finished; then the data are forwarded to the next stage, as shown in Figure 2 [1]. Sequential product development time can therefore be calculated as the sum of the times needed for the individual stages of product development.
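The sequential development time is thus the plain sum of the stage times; with the concurrent overlapping of neighbouring stages described in section 3 it shrinks. A minimal numerical sketch, in which the stage durations and the overlap fraction are invented for illustration:

```python
# Sequential development time = sum of stage times (figure 2). With
# concurrent engineering a stage may start before its predecessor ends;
# 'overlap' is the fraction of the predecessor still running at that
# start. All numbers are illustrative.

stages = {"goals": 2, "product planning": 6, "design": 10,
          "production planning": 8, "production preparation": 5,
          "manufacturing and assembly": 9}       # durations in weeks

t_sequential = sum(stages.values())

overlap = 0.4          # each stage starts when 60% of its predecessor is done
durations = list(stages.values())
finish = durations[0]
for prev, dur in zip(durations, durations[1:]):
    start = finish - overlap * prev              # start before predecessor ends
    finish = max(finish, start + dur)
t_concurrent = finish

print(f"sequential: {t_sequential} weeks, concurrent: {t_concurrent:.1f} weeks")
```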
Figure 2. Sequential product development (stages: goals, product planning, design, production planning, production preparation, manufacturing and assembly, delivery; data are built up gradually within each stage).
2.2. Characteristics of sequential engineering
Three typical types of problems exist in sequential product development:

• organisational problems (problems in collaboration, unmotivated employees, requirements and goals that are not clearly defined, weak connections between suppliers and customers),
• problems in the product development process (problems related to the explanation of requirements, problems during the search for solutions, problems related to meeting deadlines),
• technical and economic problems with products (problems related to the operation of the products, manufacturing problems, environmental protection problems, cost-related problems).

3. CONCURRENT ENGINEERING
3.1. Concurrent product development process
The main feature of concurrent engineering is the concurrent implementation of the stages in the product development process. In this case the next stage can begin before its preceding stage has been completed. Winner defined concurrent engineering as a "systematic approach to the integrated concurrent product planning and similar processes, including manufacturing and sales" [4].
Ashley defined concurrent engineering [5] as a "systematic approach to integrated product development that emphasizes the response to customer expectations. It embodies team values of cooperation, trust, and sharing in such manner that decision making proceeds with large intervals of parallel working by all life-cycle perspectives early in the process, synchronized by comparatively brief exchanges to produce consensus". Concurrent engineering is based on eight principles:

First principle: EARLY DETECTION OF PROBLEMS. Problems that are detected early in the product development process can be solved more easily than problems that are detected later.

Second principle: EARLY DECISION MAKING. In the early design stages it is much easier to influence the product design than in later stages.

Third principle: SHARING WORK. One man cannot perform several tasks at once, while parallel-connected computers can.

Fourth principle: CONNECTION OF TEAMS. Connection and collaboration within a team is not enough; it is important that there is connection and collaboration among all the teams that strive after a common goal: a customer who is satisfied with the product.

Fifth principle: USING KNOWLEDGE. A knowledgeable and experienced person is still an indispensable decision-making factor.
Sixth principle: GENERAL UNDERSTANDING. Teams work better if they know and understand what other teams do. If one team changes a particular parameter, it has to think about how this change will affect other teams.

Seventh principle: OWNERSHIP. Teams will work more enthusiastically if they have some authorisation for making decisions, and if they get some kind of "ownership" of what they have made.

Eighth principle: CONTINUOUS FOCUS ON THE COMMON GOAL. Everybody has to participate (as much as one can) in the fulfilment of the given goal of the company; everybody has to enthusiastically (and yet not competitively) collaborate with other individuals and teams.

3.1.1. Data transfer between activities in concurrent product development process
In concurrent product development the next process stage can begin before its preceding stage has been completed. Data on the current process stage are collected gradually and forwarded continuously to the next stage. The series of data exchanges between the current process stage and the next process stage ends when the data on the current stage are complete. Figure 3 presents the principle of the concurrent product development process [1].

3.1.2. Loops of concurrent product development process
In concurrent product development there are interactions between the individual stages of the product development process. Track-and-loop technology was developed for the implementation of these interactions [1]. The type of loop defines the type of co-operation between overlapping process stages. Figure 4 presents the types of loops in concurrent engineering with respect to the number of interactions between the process stages. A 1-T loop means an interaction of a process stage with itself, a 2-T loop means an interaction between two process stages, and a 3-T loop means an interaction between three process stages. As a general rule, the number of interactions between L process stages is equal to L(L-1)/2. Winner [4] proposed the use of 3-T loops, where interactions exist between three stages of the product development process. When 3-T loops are used (Figure 5) the product development process consists of five 3-T loops. Each loop is defined as the intersection of three mutually overlapping stages; this can be written as:
Feasibility loop = Goals ∩ Product planning ∩ Design
Design loop = Product planning ∩ Design ∩ Production planning
Production planning loop = Design ∩ Production planning ∩ Production
Production loop = Production planning ∩ Production ∩ Manufacturing and assembly
Manufacturing loop = Production ∩ Manufacturing and assembly ∩ Delivery and service
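These loop definitions are just intersections of consecutive stage triples, and the interaction counts of Figure 4 follow the L(L-1)/2 rule. A small sketch (the set-style representation is an assumption for illustration; stage names follow Figure 5):

```python
# 3-T loops as triples of consecutive, mutually overlapping process
# stages (figure 5), plus the interaction count L(L-1)/2 from figure 4.

stages = ["Goals", "Product planning", "Design", "Production planning",
          "Production", "Manufacturing and assembly", "Delivery and service"]
loop_names = ["Feasibility", "Design", "Production planning",
              "Production", "Manufacturing"]

sep = " \u2229 "                      # set-intersection symbol
for i, name in enumerate(loop_names):
    # each loop couples three consecutive stages
    print(f"{name} loop = {sep.join(stages[i:i + 3])}")

def interactions(L):
    # pairwise interactions between L overlapping stages
    return L * (L - 1) // 2

for L in range(1, 6):
    print(f"{L}-T loop: {interactions(L)} interaction(s)")
```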
Figure 3. Concurrent product development (data are partially built and forwarded continuously; the time needed for concurrent new product development is shorter than for sequential development).
Figure 4. Number of interactions between product development process stages:

Type of loop   Number of activities   Number of interactions
1-T            1 activity             0
2-T            2 activities           1
3-T            3 activities           3
4-T            4 activities           6
5-T            5 activities           10
L-T            L activities           L(L-1)/2

On the basis of the following requirements and restrictions:

• customer requirements,
• geometrical characteristics,
• weight,
• reliability,
• safety,
• quantity,
• lifetime,
• recycling,
• ecology,

input is transformed into output [2] in each loop. Each transformation loop is carried out in steps, as shown in Figure 6.
Figure 5. Track-and-loop process in product development (stages: goals, product planning, design, production planning, production, manufacturing and assembly, delivery; the loops together form the product development loop).
Figure 6. Transformation process in the concurrent engineering loop (step 1: initial system state of the 'i' loop; step 3a: analyses and evaluation; step 3b: improved system state of the 'i' loop, which is the initial system state of the 'i+1' loop; step 4: optimization and improvements).
Figure 7. Information flow diagram in the track-and-loop process of product development.
The information flow diagram in the track-and-loop process of product development is shown in Figure 7. An analysis of the track-and-loop process of product development, as shown in Figures 5 and 7, reveals that concurrent engineering is not possible without well-organised team work.

3.1.3. Team work
3.1.3.1. TEAM STRUCTURE IN CONCURRENT PRODUCT DEVELOPMENT PROCESS. We are dealing with team work when a team is oriented towards the achievement of a common goal [6]. Team work is an integral part of concurrent engineering, as it represents the means for organisational integration.
Figure 8. Structure of a multidisciplinary product development team (company experts are physically present and communicate through the CIS; strategic suppliers' and customers' representatives are virtually present; the team applies the basic principles and methods of CE, engineering and analysis methods such as QFD, CFD and FMEA, and cooperation, communication and continuous improvements).
Requirements for team work are [1]:

• flexible, unplanned and continuous collaboration,
• commitment regarding the achievement of goals,
• communication by exchange of information,
• the ability to make compromises,
• consensus in spite of disagreement,
• coordination when carrying out interdependent activities,
• continuous improvements in order to increase productivity and reduce process times.

3.1.3.2. TEAMS IN A BIG COMPANY. Concurrent engineering is based on a multidisciplinary product development team (PDT) [7], [8]. PDT members are experts from various departments of a company and representatives of strategic suppliers and customers (Figure 8). Product development team members communicate via a central information system (CIS) which provides them with data about the processes, tools, infrastructure, technology, and existing products of the company. Representatives of strategic suppliers and customers, due to their distance from the company, participate in the team just virtually, using the Internet information system (IIS), which allows them to use the same tools and technologies as the team members in the company [8]. In big companies the PDT structure changes in the different phases of product development. The team consists of various workgroups in the various phases of product development, and each workgroup consists of four basic teams [1]:

• The logical team ensures that the whole product development process is divided into logical units (operations, tasks) and defines the interfaces and links between the individual process units.
• The personnel team has to find the required personnel for the PDT; it trains and motivates the personnel, and provides for proper payment.
• The technology team is responsible for creating the strategy and concept. It has to concentrate on the quality of products at minimum cost.
• The virtual team operates in the form of computer software and provides the other PDT members with the data they require.

Figure 9 presents the composition of a workgroup in a big company. The goal of concurrent engineering is to achieve the best possible collaboration among the four basic teams in a particular workgroup. The multidisciplinary teams should generally have such a structure that the following goals are achieved:

• clear definition of competence and responsibility,
• short decision paths,
• identification of team members with the product being developed.
Figure 9. Workgroup in a big company (the personnel, logical, technology and virtual teams surround the workgroup).
A survey of the published works in the field of team structure planning in big companies [1], [9] has revealed that a three-level PDT structure is recommended in big companies, as shown in Figure 10.

The core team consists of the company management and the manager of the level team; its task is to support and control the product development project.

The level team consists of the level team manager and the managers of the participating functional teams in this level (loop); its task is to co-ordinate the goals and tasks of the functional teams and to ensure a smooth transition to the next level of product development.

The functional team consists of the functional team manager, experts from various fields in the company and representatives of suppliers and customers; its task is to carry out the tasks given, taking into consideration terms, finance and personnel.
Figure 10. Three-level team structure in a big company (level 1: core team; level 2: level team; level 3: functional teams).
Starbek et al.
PERSONNEL TEAM VIRTUAL TEAM
TECHNOLOGY TEAM
Figure 11. Workgroup in an SME .
In an SME a workgroup therefore consists ofju st two basic teams (Figure 11): • logical team ensures that the whole pro duct development process is divided into logical un its and that interfaces and links between pro cess units are defined; • technology team is respon sible for providing strategy and concept. With proper software tools the C IS perfor ms the role of a virtual team (workgroup members sho uld be well trained to use these tools), and project team manager carr ies out the tasks of the personnel team. For SME, the transition from a thr ee-level to two-level team struc ture is plann ed, as shown in Figure 12. Co re team [10] which suppo rts and contro ls the produ ct developme nt proj ect consists of: • core team manager (permanent member), • department managers (perm anen t members ), and • project team manager (per manent member) . Project team [10] which carries out the tasks given, taking into consideration terms, finance and personnel con sists of: • project team manager (per mane nt memb er), • experts from various fields in the company and representatives of strategic suppliers and customers (variable members). Th e project team in SME is therefore designed similarly as a functional team in a big company, the difference being in that ther e is j ust one team and its composition changes in different phases (loops) of produ ct development process.
Figure 12. Two-level team structure in an SME.
In the feasibility loop the project team should define customer requirements and goals, and make several versions of the product design; the project team should consist of employees from the marketing, product planning and design departments, and representatives of strategic customers and suppliers.

In the design loop the project team should provide general solutions regarding the product, product planning and design, its parts and components, development of prototypes, and the choice of the most suitable versions from the manufacturing point of view; the project team should consist of employees from the product planning, design and production planning departments.

In the production planning loop the project team should select the best technology routings for manufacturing of parts and assembling the components (definition of sequence, operations, selection of machines, tools and standard times); the project team should consist of employees from the design, production planning and production departments, and strategic suppliers' representatives.

In the production loop the project team should define the production type (workshop, cell or product-oriented type of production) and select the optimal layout of production means; the project team should consist of employees from the production planning department, production, manufacturing and assembly, as well as logistics and delivery.

In the manufacturing loop the project team should take care of prototype tests, supply of the required equipment, layout of production means, and manufacturing and testing of the null (pilot) series; the project team should consist of employees from the production department, manufacturing and assembly, quality assurance, warehouse and delivery departments.
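The changing composition of the project team can be captured in a simple lookup structure. The sketch below (Python; the identifier names are our own illustrative assumptions, not part of any system described here) encodes the loop memberships listed above, with the project team manager as the only permanent member:

```python
# Variable project team composition per development loop (Section 3.1.3.3);
# identifier names are illustrative assumptions.
PROJECT_TEAM_BY_LOOP = {
    "feasibility": ["marketing", "product planning", "design",
                    "strategic customer rep.", "strategic supplier rep."],
    "design": ["product planning", "design", "production planning"],
    "production planning": ["design", "production planning", "production",
                            "strategic supplier rep."],
    "production": ["production planning", "production",
                   "manufacturing and assembly", "logistics", "delivery"],
    "manufacturing": ["production", "manufacturing and assembly",
                      "quality assurance", "warehouse", "delivery"],
}

def project_team(loop: str) -> list[str]:
    # The project team manager is the only permanent member.
    return ["project team manager"] + PROJECT_TEAM_BY_LOOP[loop]

print(project_team("design"))
```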
3.2. Organisational structures

3.2.1. Functional organisational structure

The functional organisational structure is a centralised organisational structure. It is based on the requirement that the interdependent partial tasks related to a work piece and its operations are done in one place (workshop functional type). In this organisational structure, areas, sectors, services, departments and workshops are therefore formed, which perform the required special tasks. A subordinate employee can thus have several functional managers besides his line manager. The employee is responsible to his functional managers just for the corresponding functions, while he is responsible to his line manager in the organisational sense. All functional managers on the same hierarchical level therefore have the same subordinate employees. Operation of a functional structure is complicated, so it is necessary to precisely define the responsibilities of the functional managers. An example of an organisational scheme in a functional organisational structure is shown in Figure 13.
Figure 13. Organisational scheme of a functional organisational structure.
Advantages of a functional organisational structure:
• division of the hierarchical management level on the basis of (business) functions,
• specialisation and concentration of knowledge in one place,
• centralised decision making by means of a linear type of management,
• priority is given to expertise,
• it is useful for SMEs with stable production programmes,
• it allows for quick adaptation to changes,
• intensive development of individual functions (concentration of knowledge) and personnel,
• an individual function performs specialist operations for the whole company,
• there is less bureaucracy.

Disadvantages of a functional organisational structure:
• coordination between areas is unconnected and unclear,
• there are difficulties in precisely defining the working duties and responsibilities of the functional managers,
• the communication structure is complicated,
• a lot of coordination is needed when a task covering several fields should be done,
• working discipline is worse than in the linear type of organisation,
• when employees move to a higher hierarchical level, difficulties arise because tasks are no longer divided on a functional basis.
Figure 14. Organisational scheme of a project organisational structure.
In spite of these disadvantages, the functional organisational structure is still the prevailing form of organisation in companies.

3.2.2. Project organisational structure
Projects are activities that are done just once, and they consist of a series of logically interconnected activities. In order to be accomplished they require time and resources, which cause costs. A project organisational structure is used if the company runs many large projects which are not interconnected. It is formed so that the projects can be finished in the expected time frame, with costs defined in advance, and in accordance with the requirements of the client.

For every project the company forms a fixed organisation, but just for a limited period: the project team (a company within a company), which is completely responsible for execution of the project. The project team starts its mission at the beginning of the project and finishes it when the project is finished. After the completion of the project the team members are employed on other projects or in other departments of the company. An example of an organisational scheme in a project organisational structure is shown in Figure 14.

A project organisation is used if one of the following criteria is met:
• the project is large and high funds are involved,
• some of the project parameters are critical, e.g. time for completion of the project, availability of resources, or costs,
• it is the customer's requirement.

Advantages of a project organisational structure:
• planned, harmonised and controlled organisation throughout the project duration,
• the project team is entirely responsible for completion of the project goals and fulfilment of project activities,
• all project-related data are collected and evaluated in a central location,
• central responsibilities of partners, contractors and employees are ensured,
• a high level of development flexibility, using internal or external human resources,
• growth, training and education of future project managers,
• high motivation of employees, as they participate in exactly defined and interesting tasks.

Disadvantages of a project organisational structure:
• contradictions between the project-oriented view and functional dealing with organisational problems,
• disappointment of project managers due to unrealistic goals of the project,
• unsteadiness of team members due to the automatic cessation of their roles in a project team after a successful completion of the project,
• project managers tend to establish too large project teams, which increases the overhead expenses of the project,
• an integral project information system should be established as a part of the information system of the entire company.

3.2.3. Matrix organisational structure
The matrix organisational structure is a combination of functional and project (or product) organisational structures. In a matrix organisational structure a permanent project organisation is not established; only the project team manager is defined, who is responsible for the project or for the realisation of the programme (product). Project team members selected for accomplishment of the project-related tasks remain in their functional departments (in the organisational sense). Authorisation for work is given to them by their department head, and project-related tasks are given to them by the project manager.

The project (product) manager is therefore just a coordinator for the execution of tasks which are (based on his orders) carried out in the functional departments. A project team member has two managers: the department head (with respect to organisational and technical matters) and the project manager (with respect to project tasks). The matrix organisational structure got its name because of its characteristic shape. An example of a simplified organisational scheme in a project matrix organisational structure is shown in Figure 15, and a product matrix organisational structure is shown in Figure 16.

The matrix organisational structure is used when several concurrent recurring projects are being executed which require common resources of the functional departments of the company (multi-projects).

Advantages of a matrix organisational structure:
• it is based on team problem solving,
• clear coordination of tasks,
• project teams temporarily join people from various functional grounds,
Figure 15. Organisational scheme of a project matrix organisational structure.
• the project team structure may change during development of the project (concurrent engineering),
• interdisciplinary links are established in the company, so it is very flexible,
• conflicts are solved in teams,
• knowledge is concentrated in functional departments,
• priority is given to expertise.

Disadvantages of a matrix organisational structure:
• it is efficient only if team work is used,
• dual system of management and responsibility (project manager and functional manager),
• large communication needs,
• frequent conflicts and compromises.

3.2.4. Organisational structure of team work in SMEs
Figure 16. Organisational scheme of a product matrix organisational structure.

The tasks which are performed by level teams in big companies should, in an SME, be done by the project team manager, who should co-ordinate and tune the goals and
activities between the project team and the core team, and provide for a smooth transition from one phase (loop) of the product development process to another.

In big companies the members of the core, level and functional teams usually use a project type of organisation. This type of organisation cannot be used in SMEs, as they have too few employees. Analysis of various organisational structures of companies and teams [11], [12] has shown that in SMEs a matrix organisation would be the most suitable for core and project team members (Figure 17).

A member of the core team (with the exception of the project team manager) would carry out tasks in his/her department for part of his/her working time (for this work (s)he would be responsible to the general manager of the company), and for the rest of his/her working time (s)he would work on the product development project (for this work (s)he would be responsible to the core team manager).

A member of the project team (except the project team manager) would carry out tasks in his/her department for part of his/her working time (for this work (s)he would be responsible to the department head), and for the rest of his/her working time (s)he would work on the product development project (for this work (s)he would be responsible to the project team manager).
Figure 17. Ideal matrix organisation in an SME.
The project team manager would be excluded from his/her department throughout the duration of the product development project, and (s)he would work full time on the project.

3.3. Goals and tools for support of the concurrent product development process
Using concurrent engineering, the following goals should be achieved:
• considerably shorter new product development time,
• reduced new product development costs,
• better quality of new products regarding customer requirements.
a.) Considerably shorter new product development time
Product development time is expected to be reduced by 50% or more for the following reasons:
• activities run in parallel,
• team members have regular meetings, which allow for fast and efficient exchange of information,
• responsibility for all product characteristics is transferred to teams (no time is wasted searching for the one "to blame for failures").
b.) Reduced new product development costs
Figure 18 presents the diagram of the ideal cost curves in sequential and concurrent product development and use. In sequential development and use of a product we can see that:
• due to sequential activities, product development costs increase evenly,
• costs of production and use of the product increase rapidly because of the long iteration loops for execution of required modifications and elimination of defects.

In concurrent development and use of a product we can see that:
• product development costs are much higher than in sequential development, due to intensive activities during the early development stage (team work),
• costs of production and use of the product are considerably lower than in sequential product development, because of the short iteration loops for execution of required modifications and elimination of defects.
c.) Better quality of new products regarding customer requirements
Today only those companies are successful which can offer their customers:
• the right products,
• of the right quality,
• at the right price, and
• at the right time,
i.e. the companies which are able to adapt to the requirements of the customers.
Figure 18. Ideal cost curves in sequential and concurrent product development and use.
Figure 19 presents an overview of the "concurrent engineering tools"; knowledge and use of these tools ensures better quality of products.

3.3.1. Quality Functions Deployment (QFD)
The quality functions deployment method (also known as the House of Quality) is an important tool of concurrent engineering, which should ensure that all customer requirements are taken into account and realised during development of the product. The method [13] was developed at the Mitsubishi shipyard in the Japanese town of Kobe in 1972. It allows for design of the product development cycle. The method was quickly accepted in other Japanese companies; Toyota made the main contribution to its development and popularity. In Europe the method is not yet widely used. In the USA it appeared in the eighties, mostly related to the Xerox Company.

The house of quality is a method that, by using matrices, shows the connections between customer requirements and the technical capabilities of the company. It is a tool that, in the product development process (as well as during its later improvements), transforms customer requirements into specific technical solutions: product requirements.
Figure 19. Concurrent engineering tools.
Building a house of quality is team work, and it can be used as a communication tool for team members. The purpose of the method is that the customer participates in development of the product and in its later continuous improvements. Goetsch and Davis gave the following definition [14]: the house of quality is a practical tool for designing a product in such a way that it fulfils the customer requirements. The house of quality transforms what the customer wants into what the company produces. It allows the customer priorities to be defined, seeks innovative approaches for their fulfilment, and improves the process up to its maximum efficiency.

When implementing the QFD method, it is necessary to consider the following rules:
• management has to completely support the implementation of the QFD method,
• the QFD implementation project manager should be the team member who is the most experienced in using the QFD method,
• each meeting of the team should have a precisely defined goal,
• it is necessary to take minutes during every meeting,
• after the meeting the minutes are sent to all team members.

Figure 20. House of quality structure.

3.3.1.1. HOUSE OF QUALITY STRUCTURE. QFD (quality functions deployment) is called a house of quality because of its characteristic shape [13], [15], [16], [17]. It consists of six matrices, called "rooms". The house of quality structure is shown in Figure 20. There are six rooms in the house of quality:
1. WHAT room. This is a list of what the customer wants. Primary, secondary and tertiary requirements are listed. Standards, regulations and acts may also be included.
2. HOW room. This is a list of what the company and its suppliers should do in order to satisfy the customer requirements. It answers the question of how the customer requirements will be represented in the technical descriptors of the product.
3. COMPETITIVENESS ANALYSIS room. It lists the current situation of the product in comparison with its competitors, and locations of possible improvements.
4. RELATIONSHIPS room. This is the core of the house of quality. It consists of a relationship matrix between the WHAT and HOW rooms (relationships between customer requirements and technical descriptors of the product).
5. HOW MUCH room. This list is used to specify which technical product/process requirements are the most important for satisfying the customer requirements.
6. ROOF of the house of quality. It is represented by a correlation matrix between the various technical descriptors of the product.

3.3.1.2. STEPS IN CONSTRUCTING THE HOUSE OF QUALITY. Building a house of quality is simple, yet it requires a lot of effort and efficient team work. The size of the house of quality depends on the number of customer requirements. The authors of the house of quality recommend that this method be used for problems consisting of up to 30 customer requirements and just as many engineering requirements; otherwise the method becomes too complex and unclear. The house of quality is constructed in 14 steps.

Step 1: Customer requirements
Construction starts by gathering customer requirements; questionnaires and market research methods are used. The data obtained are classified into primary, secondary and tertiary: the primary requirements are general, the secondary ones define the primary ones, and the tertiary ones enable the primary ones.

Step 2: Assigning weights to customer requirements
As customer requirements can be mutually complementary or exclusive, each customer requirement is assigned its relative importance (weight).

Step 3: Technical descriptors of the product

Engineering requirements of the product (HOWs) are defined, which enable meeting the customer requirements (WHATs).
Figure 21. Steps in constructing the house of quality.
When defining engineering requirements the following questions may be useful:
- What is the function and purpose of the product?
- What does the product look like?
- How much does the product cost?
- How will the product be sold?
Step 4: Measurable target values
Measurable target values of the technical descriptors of the product are defined (usually these are numerical values; however, they can also be defined as text).

Step 5: Goals
Using an arrow, for each technical descriptor of the product we indicate whether a lower or higher value is desired; the correct (target) value is denoted by O.
Step 6: Feasibility of technical descriptors

An estimate of the feasibility of each technical descriptor of the product is given on a scale from 1 to 10, 1 being the most easily feasible technical descriptor and 10 the most difficult one.

Step 7: Relationship matrix
The central part of the house of quality is filled with data. The relationship matrix defines how the technical descriptors of the product (HOWs) are related to the customer requirements (WHATs). There are four possible relationships:
- strong relationship - weight of 9,
- moderate relationship - weight of 3,
- weak relationship - weight of 1,
- no relationship (empty cell) - weight of 0.

Practical use has shown that for a successful solution of the problem it is suitable that less than half of the matrix cells be filled in. After the data have been filled into the matrix, checks have to be made whether each customer requirement interacts with at least one technical descriptor. If there is no interaction, a new technical descriptor has to be defined which fulfils that customer requirement. If all cells in a matrix column (technical descriptor of the product) are empty, then this particular descriptor is not important.
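These completeness checks are mechanical and easy to automate. A minimal sketch (Python, with a hypothetical matrix) that flags customer requirements without any interaction and technical descriptors whose column stays empty:

```python
# Hypothetical relationship matrix: rows = customer requirements (WHATs),
# columns = technical descriptors (HOWs), weights from {9, 3, 1, 0}.
relationship = [
    [9, 3, 0],
    [0, 0, 0],   # requirement without any interaction -> a new HOW is needed
    [1, 0, 0],   # third column stays empty -> descriptor is not important
]

uncovered_requirements = [i for i, row in enumerate(relationship) if not any(row)]
unimportant_descriptors = [j for j in range(len(relationship[0]))
                           if not any(row[j] for row in relationship)]
print(uncovered_requirements, unimportant_descriptors)   # -> [1] [2]
```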
Step 8: Technical importance

For each technical descriptor of the product its absolute and relative technical importance is calculated. The absolute technical importance is calculated using the equation:

ATI = Σ_{i=1}^{n} (VR_i × I_i)

where:
ATI - absolute technical importance,
VR_i - value of the relationship of the i-th customer requirement,
I_i - importance of the i-th customer requirement,
n - number of all customer requirements.

The technical descriptor with the highest absolute (relative) technical importance obtains the highest rank, which means that it has the highest influence on satisfying the customer requirements.
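As a minimal sketch (Python, hypothetical numbers), the ATI calculation and the resulting ranking can be written as follows:

```python
# Hypothetical importances I_i of n = 3 customer requirements (Step 2)
importance = [5, 3, 1]

# Hypothetical relationship matrix VR (Step 7): rows = customer requirements,
# columns = technical descriptors, weights from {9, 3, 1, 0}
relationship = [
    [9, 3, 0],
    [3, 0, 1],
    [0, 9, 3],
]

columns = range(len(relationship[0]))
ati = [sum(relationship[i][j] * importance[i] for i in range(len(importance)))
       for j in columns]                      # ATI_j = sum_i VR_ij * I_i
total = sum(ati)
rti = [a / total for a in ati]                # relative technical importance

# The descriptor with the highest ATI gets the highest rank.
ranking = sorted(columns, key=lambda j: ati[j], reverse=True)
print(ati, [round(r, 3) for r in rti], ranking)   # [54, 24, 6] ... [0, 1, 2]
```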
Step 9: Benchmark the competition
In this step the competitiveness room is filled in. The current design of the product is compared with competitive products (our product and the competitive products are rated on a 1 to 5 scale). The benchmark is carried out on the basis of customer questionnaires and other market research methods.
Step 10: Analysis of the benchmark
The points obtained in step 9 are summed up for our product and the competitive products.

Step 11: Technical comparison
The fulfilment of the technical descriptors of our product and the competitive products is rated on a scale from 1 to 5.

Step 12: Correlation
The correlation matrix shows the interactions of the technical descriptors of the product. Interactions can be:
- strong negative - symbol =,
- negative - symbol -,
- positive - symbol +,
- strong positive - symbol ++.
The correlation matrix forms the roof of the house of quality.

Step 13: Sales focus
Those customer requirements are identified which our product fulfils best in comparison with the competitors. When fulfilling these requirements we take care to keep ahead of the competition.

Step 14: Critical technical descriptors of the product
Those technical descriptors of the product are identified that achieve the highest absolute (relative) values (using e.g. ordinal ranking from 1 to 8). These technical descriptors most strongly influence the fulfilment of customer requirements.

3.3.1.3. EXTENDING THE HOUSE OF QUALITY. The house of quality is a method for finding interactions between product functions and customer requirements. The house of quality is extended in such a way that the technical descriptors of the product in the existing house of quality (HOWs) become the requirements in the new house of quality (WHATs). First a relationship between the technical descriptors of the product and the properties of parts is found (second house of quality), then between the properties of parts and the key process operations (third house of quality), and finally between the key process operations and the production requirements (fourth house of quality). An example of such an extension of a house of quality is shown in Figure 22.
Figure 22. Extension of the house of quality.
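One way to realise this chaining computationally is to feed the normalised technical importances of one house in as the requirement weights of the next. The sketch below (Python, hypothetical matrices, not taken from the chapter's example) illustrates the idea for the first two houses:

```python
def technical_importance(weights, relationship):
    """Normalised ATI per column: ATI_j = sum_i relationship[i][j] * weights[i]."""
    columns = range(len(relationship[0]))
    ati = [sum(relationship[i][j] * weights[i] for i in range(len(weights)))
           for j in columns]
    total = sum(ati) or 1.0
    return [a / total for a in ati]

customer_weights = [0.5, 0.3, 0.2]
house1 = [[9, 3], [3, 1], [0, 9]]   # customer requirements -> product descriptors
house2 = [[9, 0, 3], [1, 9, 0]]     # product descriptors  -> part properties

part_weights = technical_importance(customer_weights, house1)
process_weights = technical_importance(part_weights, house2)
print([round(w, 3) for w in process_weights])   # -> [0.518, 0.321, 0.161]
```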
3.3.1.4. ADVANTAGES OF USING THE HOUSE OF QUALITY. There are several benefits if a company uses the house of quality method, especially in the fields of improving competitiveness and quality. They are expressed in:
• Focus on the customer. Every company that has introduced TQM has to be focused on the customer. The house of quality allows input and feedback data to be collected from customers; these data are transformed into a collection of customer requirements and become target values that the company has to achieve.
• Better use of time. The house of quality reduces product development time because it shows the most important and clearly defined customer requirements. Therefore time is not wasted developing features which are of no interest to the customer.
• Team work. As a method, the house of quality is oriented towards team work. All decisions are the results of a consensus of the team members.
• Consistent documentation. One of the results of the house of quality is an exhaustive document which combines all data about the processes and shows how they complement each other in satisfying the customer requirements. The document is continuously updated as new data are obtained. In order to successfully plan new products and improve existing ones it is necessary to note information on customer requirements daily.
3.3.2. Value analysis

L. D. Miles wrote that value analysis [18] is an organised creative method whose task is to show exactly and efficiently the unnecessary costs, i.e. the costs which contribute neither to the quality, usefulness or lifetime of a product nor to its aesthetic function or other characteristics desired by the customer.

Value analysis is a system which allows the solution of complex problems that cannot be completely or partially transformed into an algorithmic form. It consists of the combined actions of the following system elements:
• management,
• method, and
• mode of operation,
with their simultaneous mutual impact, the goal being to optimise the end result.

Value analysis is a professionally applied, function-oriented, systematic team approach used to analyse and improve value in a product, facility design, system or service: a powerful methodology for solving problems and/or reducing costs while improving performance and quality requirements. Value analysis is a systematic method which can be used in order to reduce the costs of a product or service. It is a creative process, a systematic search for facts and alternatives, whose purpose is to reduce costs to a minimum in each phase of the product life cycle [19].

The concept and techniques of value analysis are called basic when dealing with "economy decisions". Proper use of value analysis ensures better results when searching for and reducing unnecessary costs. However, like any other tool, value analysis can be used improperly, which means that the desired results are not obtained. Considering the fact that the method has been successfully used in industry for more than 40 years, we can conclude that it is usually improper use that yields unsatisfactory results. Value analysis is not a substitute for design-engineering and production-engineering knowledge; it is an excellent systematic approach to using this knowledge.
Value analysis is an aid which allows the company to preserve or increase its competitiveness on the market. At the beginning value analysis was used only in mass production (and to a great extent it still is today). However, attempts to use value analysis in small-series production (or even in individual production) were extremely successful. It is obvious that it is more sensible to use value analysis as the quantity or price of the analysed object increases [20], [21]. Today value analysis is limited neither to products manufactured in mass or individual production nor to the size of the company or the industry.

Objects of value analysis can be:
• products,
• production systems,
• administration,
• organisation.

Selection of the object of value analysis depends on business decisions, supported by proper analyses. According to VDI 2222 [22], value analysis of a product can be used in all three key phases of product development:
• development,
• design,
• production.

Naturally, the most benefit is obtained from value analysis if it is used in the design phase. The sooner value analysis is used in product development in order to find economic solutions, the higher the benefits will be. In the design phase value analysis deals with products which exist only as drawings, models or prototypes, i.e. things which are not yet in production. In the production phase a value analysis of products already on the market is made. The graphical presentation of value analysis shows clearly why it is so important to use it early in the product development phase; see Figure 23.

Figure 23. Reduction of product costs with value analysis in different phases of the product life cycle [22].

Goals of research made by value analysis arise from the goals of the company [21]. Depending on strategic orientation, the goals of the research are:
• increase of profit,
• increase of usefulness for the customer,
• achieving competitive advantages.

The results of value analysis are usually presented as reduced costs. An additional benefit is that customer requirements are fulfilled well, and thus a competitive advantage is achieved. Using value analysis, an optimum between producer costs and customer benefits is expected:
• an increase of usefulness for the customer was shown in 80% of all studies,
• reduction of throughput time by up to 50%,
• reduction of costs by up to 20%.

Depending on the goals, the costs may be reduced, the number of functions may be increased, the quality may be improved, or the processes may be sped up. Value analysis research should increase productivity and increase value for the end user. 90% of all studies revealed an increase of quality in spite of reduced production costs; the remaining 10% revealed that the same quality was retained.

In addition to quantitative results, value analysis brings several additional benefits to the company:
• Employees' thinking is oriented towards goals, costs and functions.
• All participants are motivated to give their contribution to achieve success.
• Collaboration inside the company is improved.
• Capabilities of team work are improved.
• The creativity of all employees is used.
Figure 24. Value analysis method (DIN 69910):
1. PROJECT SETUP: 1.1 appointing a moderator; 1.2 undertaking the order, finding general goals; 1.3 definition of individual goals; 1.4 limiting the scope of research; 1.5 finding the project organisation; 1.6 planning the project.
2. ANALYSIS OF THE CURRENT SITUATION: 2.1 data about the subject; 2.2 data about costs; 2.3 finding functions; 2.4 assigning costs to functions.
3. DESCRIPTION OF THE "TO BE" SITUATION: 3.1 evaluation of data; 3.2 definition of "TO BE" functions; 3.3 assigning target costs to the "TO BE" functions.
4. DEVELOPMENT OF DRAFTS: 4.1 gathering existing solutions; 4.2 development of new ideas.
5. SELECTING THE BEST SOLUTION: 5.1 definition of evaluation criteria; 5.2 evaluation of drafts; 5.3 review of possible drafts; 5.4 evaluation of possible drafts; 5.5 making detailed solutions; 5.6 evaluation of solutions; 5.7 defining decision tables; 5.8 making a decision.
6. REALIZING A SOLUTION: 6.1 detailed plan of realization; 6.2 implementation of realization; 6.3 controlling the realization; 6.4 finishing the project.
Figure 25. Iteration model of value analysis.
The value analysis method is standardised in DIN 69910 [23] and consists of six steps (Figure 24). The steps are divided into sub-steps, which can be mixed or repeated in several iterations (Figure 25).
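The iteration model can be sketched as a simple driver that gates each step with an "approaching the goal?" check, as in Figure 25. The function names below are assumptions for illustration; the step work and the convergence check would be supplied by the value analysis team:

```python
STEPS = [
    "project setup",
    "analysis of the current situation",
    "description of the 'TO BE' situation",
    "development of drafts",
    "selecting the best solution",
    "realizing a solution",
]

def run_value_analysis(perform_step, approaching_goal, max_iterations=10):
    """Run the DIN 69910 steps, repeating each until the goal is approached."""
    for step in STEPS:
        for _ in range(max_iterations):
            perform_step(step)
            if approaching_goal(step):   # gate between steps (Figure 25)
                break
        else:
            raise RuntimeError(f"no convergence in step: {step}")

# Trivial demo: every step 'converges' on the first pass.
run_value_analysis(lambda s: print("doing:", s), lambda s: True)
```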
3.3.3. Failure Mode and Effects Analysis (FMEA)

Failure Mode and Effects Analysis (FMEA) is a method of preventive quality assurance. The goal of the FMEA method is to find and prevent possible failures during product development and manufacturing. Failures that arise during production or use of the product cause high costs; because of them, the company often loses its reputation in the view of its customers. FMEA is a target-oriented method which allows us to find possible failures in time. The risks resulting from failures are evaluated, and corrective measures are developed to prevent the failures.

The FMEA goals are:
• evaluation of the effects and consequences of events which would be caused by each failure found in the system,
• definition of the value or criticality of each failure with respect to the proper function of the system and its influence on the reliability and/or safety of the process,
Table 1. Types of FMEA
• System FMEA - object of analysis: superior product/system (e.g. a car); elements of FMEA: project of a product; when: project of a product after manufacturing; responsibility: development.
• Design FMEA - object of analysis: important components; elements of FMEA: design documentation; when: design documentation after manufacturing; responsibility: design.
• Process FMEA - object of analysis: manufacturing process steps (e.g. casting); elements of FMEA: manufacturing plans; when: plan after manufacturing; responsibility: manufacturing planning.
• FMEA of joint-ventures, suppliers - object of analysis: service steps; elements of FMEA: plans of services; when: plan after service; responsibility: planning a service.
• system FMEA define s function ality of individual system components with respect to the complete system and int erconnections between individual components (e.g. operation of the engine, gearbox and drive shaft at the gearbox); • design FMEA is used for finding possible failures of individu al compo nent in design, manufacturing and assembly; • process FMEA researches possible sources of failures in produ ction process, • service FMEA is used for j oint-ventures and suppliers. Types of FMEA and their basic features are shown in Table 1. Their commo n feature is the same appro ach. D ifferences between FMEA types are visible especially in [he design phase and in definiti on of a goal, which corre sponds [ 0 their execution. Although it makes sense to use all types of FMEA , in practice most often design and pro cess FMEA are used; th ey are divided as shown in Figure 26. Using FME A has the following advantages [24]: • It helps at selection of alterna tive design solutions with high reliability and safety already in early development phase. • It identifies possible failures and their effects, which influe nce efficiency of product functions. • Program of tests is made in development phase, before final confirmatio n of design. • Criteria for definition of produ ction process, supply and service are developed . • Failures are document ed as futur e references in order to help us in failure analysis during use, and when dealing with design changes. • I[ is a basis for findin g priorities of corrective actions.
In Figure 26, design FMEA is divided into components, sub-systems and the main system, while process FMEA covers machines, tools, workstations, production lines, processes and control devices.
Figure 26. Division of design and process FMEA.
FMEA is a preventive technique which allows for a systematic study of the causes and effects of failures before the design is finished. The product is analysed (on the system or a lower level) from all possible points of view which may lead to failures. For each possible failure, the effects on the entire system are estimated, and their severity and frequency of occurrence are defined.

Drawbacks of FMEA:
• It is difficult to perform FMEA for complex systems which perform several functions and consist of many components.
• FMEA results do not take into account human errors. Human errors usually appear in a certain sequence during system operation. Yet FMEA can find the components which are the most sensitive to human factors.

Execution of the FMEA is in the competence of the company management, whose task is to:
• define the requirement for FMEA execution,
• define the goal,
• define the limits of problem solving,
• define the deadline for execution of the task,
• form the workgroup.

Setup and execution of FMEA is the result of team work. Figure 27 presents the composition of a workgroup responsible for execution of the FMEA analysis. The method is divided into several working steps [25]. Figure 28 presents a form for execution of the method, in which the individual steps, which follow each other in sequence, are shown.
Figure 27. Composition of a workgroup.
The header of the form is first filled out with the basic data required for a clear definition of the product. The form is then filled out in four steps:

Step 1: Failure analysis
According to the FMEA type used (system, design or process) it is necessary to define the system or design functions and the individual production process steps. Possible failures, their effects and the sources of failures are analysed in detail.

Step 2: Risk assessment
For each possible cause of failure, the probability of its arising (risk factor N) is estimated and assigned a value from 1 (not probable) to 10 (highly probable). For each cause of failure, the influence or meaning of the failure for the customer is estimated (risk factor V); it is important for the customer that the product works well, so an estimate from 1 (no consequences) to 10 (great consequences) is used. For each source of failure, the probability of finding the failure is estimated (risk factor O); the range of estimation is from 1 (high probability) to 10 (not probable).

In order to define the total risk of a possible cause of failure, the preventive risk number (PRN) is calculated as the product of the estimated values for the arising N, the influence V and the finding of the failure O:

PRN = N × V × O
The value of PRN is between 1 (no risk) and 1000 (very high risk). However, the PRN value alone is not enough: if the causes of failures are sorted by PRN, it is possible to define the priority of their elimination. High-PRN causes can be eliminated by introducing corrective measures into the product and the production process.
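A minimal sketch (Python, hypothetical failure data) of the PRN calculation and the resulting elimination priority:

```python
# (description, N: probability of arising, V: meaning for the customer,
#  O: probability of finding the failure) -- each rated from 1 to 10
failure_causes = [
    ("seal wears out early",           7, 8, 6),
    ("weld seam porosity",             3, 9, 4),
    ("wrong bolt torque at assembly",  5, 4, 2),
]

def prn(n: int, v: int, o: int) -> int:
    """Preventive risk number: 1 (no risk) .. 1000 (very high risk)."""
    return n * v * o

# Sorting by PRN gives the priority for eliminating the causes of failure.
for description, n, v, o in sorted(failure_causes,
                                   key=lambda c: prn(*c[1:]), reverse=True):
    print(f"PRN = {prn(n, v, o):4d}  {description}")
```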
Figure 28. Form for execution of the FMEA method.
Figure 29. The VEPER mini-loader.
Step 3: Measures for optimisation of product design
With respect to the individual risk assessments (the value of PRN) it is possible to introduce appropriate corrective measures and improvements into the product design. This can be done at the company level or just in a particular department.

Step 4: Assessment of results

Using the above-mentioned procedures and measures it is possible to correct individual deficiencies. These improvements have to be re-evaluated with regard to possible failures (step 2, the PRN calculation, has to be repeated).

4. SAMPLE CASE OF INTRODUCTION OF CONCURRENT ENGINEERING IN AN SME
An SME which produces civil engineering equipment decided to develop a mini-loader (Figure 29). The mini-loader development process ran in two phases:
1. Analysis of customer requirements (i.e. market analyses) and construction of the house of quality.
2. Planning and execution of the mini-loader development project using the concurrent engineering principle.
4.1. Building a house of quality
In order to build a house of quality the company management formed a team whose members came from the following departments:
• marketing and sales,
• development and product planning,
• design,
• production process,
• production,
• QC/QA,
• supply, and
• an external member (designer).
The head of the marketing department was selected as the project team manager. Before starting the construction of the house of quality for the mini-loader, the team members were informed in detail about the product, possible customers, domestic and global competitors, and manufacturing costs. After the preliminary activities had been finished, the team members performed all 14 steps in the construction of the house of quality, which is shown in Figure 30.

Analysis of the house of quality for the mini-loader led the team members to some important conclusions:

1. The mini-loader produced by the company fulfils the following customer requirements better than its competitors:
• it is lower and narrower than competitive products,
• it has a recognisable design (influence of the external designer),
• it can be transported on a trailer,
• it consumes less fuel,
• it is cheaper than competitive products.
In comparison with the competition, the product is worse regarding the following three requirements:
• its components are of worse quality,
• delivery time is much longer,
• maintenance is more demanding.
2. The mini-loader produced by the company fulfils the following technical descriptors better than its competitors:
• engine power,
• size and weight of the mini-loader,
• volume of the ladle,
• selection of colour,
• cost of materials used.
The product is worse regarding the following technical descriptors:
• smaller load capacity,
• smaller tearing force,
• maintenance frequency,
• too small lot size.

3. A highly positive correlation exists between the following pairs of technical descriptors of the product:
• engine power and load capacity,
• weight and size,
• stability and tearing force,
• organisation level of the service department and maintenance frequency.

A highly negative correlation exists between:
• quality of engine and cost of purchased parts,
• engine power and maintenance frequency,
• quality of pump and cost of purchased parts,
• load capacity and design simplicity,
• stability and design simplicity.

4. In further development of the mini-loader it will be necessary to pay special attention to the following technical descriptors:
• size of the mini-loader,
• construction simplicity,
• weight of the mini-loader,
• universality of the connection plate,
• lot size,
• quality of pump,
• quality of engine,
• evaluation of purchased parts.

The results of the team work, with an emphasis on the construction of the house of quality for the mini-loader, were presented to the company management; it was stressed that this was the first of four houses of quality which should reveal how the product fulfils the customer requirements. The company management and the team members discussed the results obtained and decided that the team should proceed with the construction of the other three houses of quality:
• house of quality for planning parts and components,
• house of quality for production process planning, and
• house of quality for manufacturing and assembly planning.

The four houses of quality for the mini-loader will be used in order to gradually transfer customer requirements from the product to its components and parts, from components and parts to production processes, and from production processes to manufacturing and assembly.
4.2.1. Goals of theproject andproject team
The company decided to develop a new mini-loader in a project style. The goal of the project was development of mini-loader and implementation of the concurrent engineering in the company. In order that the company could switch to the concurrent development of miniloader it was necessary first to decide about the structure and composition ofconcurrent product development teams.
The company management decided to form a two-level team structure (core and project teams). In order to get the best structure for both teams, two creativity workshops were organised, with the general manager, his assistant and nine department managers participating.

Results of the first creativity workshop have shown that the core team should consist of eleven company employees:
• the general manager, who would manage the core team,
• the nine department managers,
• the assistant general manager, who would manage the project team.

All core team members will be permanent members; the core team composition will therefore not change during the mini-loader development time.

4.2.2. WBS of the project and responsibility matrix
The second creativity workshop was organised in order to define the stages of the mini-loader development process and their corresponding activities, as well as the responsibilities of departments for carrying out those activities. For the new mini-loader development project a WBS structure of the project was made, as shown in Figure 31. For execution of the project activities, responsibilities were assigned to department heads and company employees, as presented in the responsibility matrix (Table 2).

4.2.3. Structure of a project team for execution of concurrent engineering loops
The results of the second creativity workshop and the selection of the project team manager allowed for the definition of the project team structure in the individual loops of the mini-loader development, as shown in Table 2. The changeable structure of the project team in the loops of the mini-loader development is shown in Table 3. The project team manager will be a permanent team member, while experts from the nine departments of the company and representatives of designers, suppliers and customers will be variable team members. After the structure of the core and project teams had been defined, it was possible to form a two-level team structure for the mini-loader development (Figure 32).

4.2.4. Time and structural plan
Up to now the producer of mini-loaders has developed new products sequentially. Analysis of past results of sequential development of mini-loaders has shown that the average development time for a particular product was four years. These days the market demands short delivery terms and short development times. In order to reduce the mini-loader development time (and thus gain a competitive advantage) the company decided to develop the new type of mini-loader concurrently.
Figure 31. WBS of the mini-loader development project.
Table 2. Responsibility matrix of the mini-loader (the original table assigns departments and employees to each activity)

Stages of product development and the planned activities within each stage:
1. Definition of goals: goals, term plan.
2. Feasibility study: financial plan, pre-calculation, goals of market.
3. Product planning: first draft of the product, first draft of components, planning of the product.
4. Design: design of components, drawings of parts, bills of material.
5. Process planning: material requirements, technology routings, control procedures, preparations, documentation of orders, overview of stock, creation of orders, order of material, acceptance and storing.
6. Manufacturing: launch of production, preparation of material, manufacturing of appliances, check, test and control.
7. Marketing and sales: offer and contract, preparation of the product, final control, supply.
Starbek et a!.
Table 3 Project team structure in individual loops of the mini-loader development PROJECTTEAM MEMBERS DESCRIPTION STAGES. iJ .D OFTHE INCLUDED IN THE E LOOP: LOOP: ;;;
"
C/O
~
C/O
'3
~
V::
:3
0. 0 0
OJ
...1
1.
FEASIBILITY LOOP
2.
PROJECTLOOP
3.
DESIGNLOOP
4.
PROCESS PLANNING LOOP
13
5.
MANlJFACTlJRING AND ASSEMBLY LOOP
14
A creativity workshop was organised with all members ofthe core team participating. They were asked to estimate or define the following: • duration of individual stages (activities) in the concurrent product development process; • possible connections between stages (activities); • types and planned times of overlapping stages (activities). Results of the core team work during mini-loader development are shown in Table 4. The data on times, connections and overlapping of stages (activities) in concurrent mini-loader development (shown in Table 4) are the input data for the CA-SP] software which was used to design the Gantt chart of the development process of the new type of mini-loader (Figure 33). Analysis of the Gantt charts of the existing sequential and the planned concurrent development of the new mini-loader has shown that if the company shifts from sequential to concurrent engineering, it will be able to launch a new mini-loader in 25 months instead of four years as before-which would considerably improve the competitiveness of the company. The success of the concurrent mini-loader development process mostly depends on the effectiveness of work of the project team in the product development loops, and therefore activities in future will be directed towards a detailed organisation and co-ordination of the project team members during individual loops of product development.
......,
Figure 32. Two - Icvcl tc .u n str uc tu re during m in i-l oad er developm ent .
VARIABLE STRUCTURE OF PROJECT TEAM IN PRODUCT DEVELOP MENT PROCESS
PERMANENT STRUCTURE OF CORE TEAM IN PRODUCT DEVEL OPMENT PROCESS
a· Manager ot IT department
Finance Informa tion unit
8-
9·
16· Manufacturing
15 · Cooperation
14 · Quali ty
13 · Oparative prepare
12 - Logistics
11 · Sha pIng
10 - Delivery
Markehng Sa les
5· 7-
Su pply
46-
Design Prod. proc. plan
3·
Development Product planning
1·
Manager of SUPPLY departm ent
2·
Manager 01TECHNO LOGY department
t, j.
h · Manager 01DESIGN department
f · CORE TEAM manager g _ Manager 01DEVELOPMENT and PLANNING department
e· Manager 01PRODUCTION department
c - Manager 01FINANCtAL department d . Manager 01MARKETING and SALES department
b· Manager of DUALI TY department
N
........
D efinition of goa ls Feasibility stu dy
Pro du ct planning
De sign
Process plan ning
8
12
16
D ESC R IPT ION O F PRO D U C T DE VE LO PM EN T STAG E
1 3
Stage id .
13 5 14 3
11
4 5 7
18 19 20 21 22 23 24 25
Techn ology ro utings Cont rol procedures Preparations D ocumenta tion of orders Overview of stock
C reatio n of orders O rde r of mater ial
Acceptance and storing
[0
8 9 8
10 14 15 13 18 19 18 19 21 20 17 22 24
9 9
II
9
9
9 9
2 2 5 4 5 2
Pre ceding activity id
5
4 4
10
15 17
14
10 11 13
9
3 13 12
2 4 5 6 7 19
Activ ity d uration estima tio n [months]
Act ivity id .
D rawings o f part s Bills of materi al M aterial requ irement s
First draft of th e produ ct First draft of com po nents Planning of the product D esign of co m po ne nts
Goa ls Ter m plan Financial plan Pre- calculation Goals of market
Plan ned activities within the stage
Table 4 D ur ation of activities, types, and tim es of overlapping activities during m ini-loader develop me nt
x
f'S
x x
x
x x
x
X
x
x x
x x
x x
x
x x x x
x x x
SS
x
x x
FF
Type o f ove rlap
I
0 0 1 0 5 2
[
0 3 1
0
3
0
2 1 3 3 0 3 3
0
()
1 2
Time of overlap [m onths]
..
'"
.....
Manufacturing and assembly
Marketing
26
34
11
6 8 4 5 4 4 11 4 2 3
27 28 29 30 31 32 33 35 36 37 38
Launch of production
Preparation of material
Manufacturing of appliances
Manufact. of components Assembly Check
Test and control Offer and contract Preparation of the product
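To show how such data turn into a concurrent schedule, the sketch below (Python, hypothetical activities; the FS/SS/FF overlap semantics are our assumed reading, and this is not the CA-SPJ algorithm itself) computes earliest start and finish times from durations, precedence links and overlaps:

```python
activities = {
    # id: (duration [months], predecessor id or None, overlap type, overlap [months])
    1: (2, None, None, 0),   # e.g. goals
    2: (1, 1, "FS", 0),      # term plan: starts when activity 1 finishes
    3: (3, 2, "SS", 1),      # financial plan: starts 1 month after activity 2 starts
    4: (2, 3, "FF", 0),      # pre-calculation: finishes together with activity 3
}

start, finish = {}, {}
for aid, (duration, pred, link, lap) in activities.items():  # ids in precedence order
    if pred is None:
        start[aid] = 0.0
    elif link == "FS":
        start[aid] = finish[pred] - lap     # overlap lets the successor start early
    elif link == "SS":
        start[aid] = start[pred] + lap
    elif link == "FF":
        start[aid] = finish[pred] + lap - duration
    finish[aid] = start[aid] + duration

print({a: (start[a], finish[a]) for a in activities})
# -> {1: (0.0, 2.0), 2: (2.0, 3.0), 3: (3.0, 6.0), 4: (4.0, 6.0)}
```

Overlapping links of this kind are what compress the sequential four-year plan into the 25-month concurrent schedule discussed above.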
Figure 33. Gantt chart of the concurrent development of a new type of mini-loader.
5. CONCLUSION
The global market requires short product development times, and therefore small companies are forced into a transition from sequential to concurrent product development. As the basic element of concurrent product development is team work, this chapter pays special attention to the formation and structure of teams in a small company. Research has led us to the conclusion that a workgroup in a small company should consist of just two teams (logical and technology team) instead of four, and that a two-level team structure (permanent core team and variable project team) is more suitable for small companies.

In order to reach these goals the companies will have to shift from individual to team work, implement the known methods for quality management of products and processes, and finally organise the process of concurrent engineering for new product implementation with emphasis on:
• Computer-aided design (CAD)
• Quality functions deployment (QFD)
• Design methodology
• Value analysis (VA)
• Evaluation of quality
• Design for manufacturing (DFM) and assembly (DFA)
• Failure mode and effects analysis (FMEA)

The proposed concept of team formation in a small company has been tested in a sample case of team composition in a company producing mini-loaders. First the permanent core team structure and then the variable project team structure were defined. The team of company department managers carried out the activities of constructing the house of quality for the product. With the construction of the first house of quality, which refers to product planning, the voice of the customer has not yet reached the lowest level of product planning (manufacturing and assembly); the team will have to build another three houses of quality for the mini-loader:
• house of quality for planning parts and components,
• house of quality for production process planning, and
• house of quality for manufacturing and assembly planning.

Construction of the four houses of quality for the mini-loader will enable the team to gradually transfer the requirements and wishes of the customer from the product to its components, from components to production processes, and from production processes to manufacturing and assembly. Team work and the construction of houses of quality are important elements of concurrent product implementation: the former is a means of organisational integration, and the latter provides for the fulfilment of customer requirements.
Finally, the team of the company's department managers drew up a project for introducing concurrent product development into the company. The results of the project have shown that if the company shifts from sequential to concurrent engineering, it will be able to launch a new mini-loader in 25 months instead of the previous 48 months.
DESIGN AND MODELING METHODS FOR COMPONENTS MADE OF MULTI-HETEROGENEOUS MATERIALS IN HIGH-TECH APPLICATIONS
KE-ZHANG CHEN AND XIN-AN FENG
1. INTRODUCTION
With the rapid development of high technology in various fields, increasingly critical requirements arise for special functions of components and products. For example, the thermal deformation of a satellite's paraboloid antenna (10 meters in diameter) should be controlled within 0.2 mm in order to work well in an environment with large variations in temperature (-180°C~120°C). To fulfil this requirement, the antenna's thermal expansion coefficient should be close to zero. Another example is that Poisson's ratios of sensors should be negative in order to increase their sensitivities to hydrostatic pressures. If the Poisson's ratio of a sensor can be changed from an ordinary value of 0.3 to -1, its sensitivity will be increased by almost one order of magnitude. The third example concerns the cylinders of vehicular engines or pressure vessels. They are subjected to a high temperature/pressure on the inside while the outer surface is subjected to ambient conditions. It is desirable to have ceramic on the inner surface owing to its good high-temperature properties, while it is also desirable to have metal away from the inner surface owing to its good mechanical properties. Joining the two materials abruptly will lead to stress concentration at the interface, so a gradual change of constituent composition is required. Components made of homogeneous materials, however, rarely possess all the special functions mentioned above. Recently, attention has focused on heterogeneous materials, including composite materials, functionally graded materials, and heterogeneous materials with a periodic microstructure.
Figure 2. Heterogeneous material with a periodic microstructure.
In the most general case, a composite material [1-3] consists of one or more discontinuous phases distributed in one continuous phase, as shown in Figure 1. The continuous phase is called the matrix and may be resin, ceramic, or metal. The discontinuous phase is called the reinforcement or inclusions and may be fibers or particles. The inclusions are used to improve certain properties of the material or matrix, such as stiffness, behavior with temperature, resistance to abrasion, and reduction of shrinkage. For instance, particles of brittle metals (such as tungsten, chromium, and molybdenum) incorporated in ductile metals can improve their properties at higher temperatures while preserving their ductility at room temperature; and elastomer particles can be incorporated in brittle polymer matrices to improve their fracture and shock properties by decreasing their sensitivity to cracking. Functionally graded materials [4, 5] are used to join two different materials without stress concentration at their interface. The gradation in properties from one portion to another can be determined by the material constituent composition: the volume fraction of one material constituent is changed from 100% on one side to zero on the other side, and that of the other material constituent is changed the other way round. These functionally graded materials can help reduce thermal stress, prevent peeling-off of coated layers, prevent micro-crack propagation, and provide high-temperature and impact-resistant capability. A heterogeneous material with a periodic microstructure [6-10] is described by its base cell, which is the smallest repetitive unit of the material and comprises a material phase and a void or softer material phase, as shown in Figure 2.
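To make the gradation idea concrete, here is a minimal Python sketch of a graded volume-fraction profile; the power-law form and the exponent p are illustrative assumptions, not a formula from this chapter:

# Hypothetical sketch of a functionally graded material (FGM) profile: the
# volume fraction of constituent A falls from 100% on one face to 0% on the
# other, and constituent B changes the other way round. The power-law form
# and the exponent p are illustrative assumptions.

def volume_fractions(z: float, thickness: float, p: float = 2.0):
    """Return (fraction_A, fraction_B) at depth z through the gradation zone."""
    if not 0.0 <= z <= thickness:
        raise ValueError("z must lie within the gradation zone")
    f_a = (1.0 - z / thickness) ** p   # 100% at z = 0, 0% at z = thickness
    return f_a, 1.0 - f_a              # the two fractions sum to 1

# Example: ceramic-rich inner surface grading into metal, sampled at 5 depths.
for z in [0.0, 0.25, 0.5, 0.75, 1.0]:
    ceramic, metal = volume_fractions(z, thickness=1.0)
    print(f"z = {z:.2f}: ceramic {ceramic:.2f}, metal {metal:.2f}")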
Figure 3. Topology of the material with Poisson's ratio -0.8: (a) base cell; (b) repeated base cells.
It should be emphasized that the microscopic length scale is much larger than molecular dimensions but much smaller than the component size. A material with a special microstructure may have special properties, such as a zero thermal expansion coefficient and a negative Poisson's ratio. Its effective properties are determined by the topology of its base cell and the properties of its constituents, and can be predicted by the homogenization theory [9, 10]. In other words, its effective properties, or the values of its property characteristics, can be changed by designing various topologies of its base cell. With the homogenization theory, the topology of the base cell for required properties can be designed using topology optimization [9, 10]. The design domain is the base cell, which is discretized by four-node quadrilateral finite elements. The design variables are the densities of material in the finite elements. The design goal is to minimize the error in obtaining the prescribed elastic properties for several loading cases. For instance, we may specify the elastic properties of a material with Poisson's ratio -0.8 and solve the optimization problem for a quadratic base cell discretized by 1600 quadrilateral finite elements, each representing one design variable. The resulting topology is shown in Figure 3(a) [6]. The base cell is repeated as shown in Figure 3(b), where the mechanism is seen more clearly. When the material is compressed horizontally, the triangles will collapse and result in a vertical contraction, which is the characteristic behavior of a material with a negative Poisson's ratio. However, there currently exists no systematic and effective method for designing components made of multi heterogeneous materials according to the functional requirements of their high-tech applications. The history of materials development has followed the sequence from process to properties and then to performance or applications. Humans discovered materials naturally produced by volcanic action and then found suitable uses for them. A simple mixture of clay, sand, and straw produced a composite, which was found to have some good properties and was then used as a building material by the oldest known civilizations. Even in the case of plastics, processing techniques such as the polymerization process and the incorporation of fibers in polymers
were developed first, followed by characterization of material properties and microstructures. After their attractive properties (e.g., the considerably low ratio of weight to strength, high corrosion and thermal resistance, high toughness, and low cost) were identified, they were then used in various fields, such as aerospace, transportation, and other branches of civil and mechanical engineering. Therefore, the conventional component design method is always to first choose a kind of material, then design the configuration of a component, and check whether the component can satisfy the functional requirements. For multi heterogeneous components, however, the design process has to be reversed according to Axiomatic Design theory [11, 12], i.e., from functional requirements in the high-tech application to material properties, to microstructures and/or constituent compositions, and to process. It can be developed under the guidance of Axiomatic Design.

2. DESIGN METHOD FOR THE COMPONENTS MADE OF MULTI HETEROGENEOUS MATERIALS
2.1. Design procedure
According to Axiomatic Design, design involves the continuous processing of information between and within four distinct domains: the consumer domain, the functional domain, the physical domain, and the process domain. Customer needs are established in the consumer domain and then formalized in the functional domain as a set of functional requirements (FRs) that govern the solution process. The creation of a synthesized solution proceeds through the mapping between the FRs in the functional domain and the design parameters (DPs) in the physical domain. The DPs in the physical domain are then mapped into the process domain in terms of process variables (PVs). The customer domain of multi heterogeneous component design is where the desired performances of the component in the high-tech application are specified. These desired performances are its customer attributes (CAs). In the functional domain, its FRs are the configuration of the component and the properties of the materials in different portions of the component, which can provide the desired performances specified in the customer domain. These FRs are satisfied by choosing the microstructure and/or constituent composition of the materials and the optimal parameters of the component's geometric shape, which are its DPs in the physical domain. Finally, its PVs in the process domain specify how the desired microstructure and/or constituent composition and geometric parameters can be created. Figure 4 shows the design process for multi heterogeneous components in terms of the four domains of the design world.
Figure 4. Design process for multi heterogeneous components in terms of the four domains of the design world: desired performances (consumer domain), configuration and properties of materials (functional domain), microstructures and/or composition of materials (physical domain), and manufacturing processes (process domain).
According to Figure 4, mapping from the customer domain to the functional domain is the mapping from the component's performances required in the high-tech application to the component's configuration and the properties of the materials applied in the component. Mapping from the functional domain to the physical domain is the mapping from the required configuration and the required material properties to the material microstructure and/or constituent composition and the optimal geometric parameters of the component. Mapping from the physical domain to the process domain is the mapping from the material microstructure and/or constituent composition and the optimal geometric parameters of the component to the process variables for manufacturing the physical component. The mapping process between two adjacent domains can be summarized as the mapping between the design requirements (DRs) ('what we want to achieve') and the design solutions (DSs) ('how we achieve them'). When mapping from FRs to DPs, the FRs are the design requirements (DRs) and the DPs are the design solutions (DSs). But when mapping from DPs to PVs, the DPs become the design requirements (DRs) and the PVs are the design solutions (DSs). For a design with only one objective function or design requirement (DR), the DR does not involve the independence requirement and can reach its optimum by adjusting its corresponding DSs. Therefore, Optimization Design [13] is very effective in this case. But for a design with multiple objective functions (DRs) that are controlled by the same set of DSs, it is not very effective, because the same set of DSs cannot drive all DRs to their optimums at the same time. When one adjusts the same set of DSs to let, for example, the second DR reach its optimum, the first optimized DR will be changed; one must then go back and adjust the DSs to optimize the first DR, which in turn changes the second optimized DR. In this case, the most important DR is usually selected as the objective function and the others are eliminated, or different weights are assigned to the different objective functions to form one composite objective function. For the former, the problem becomes a one-DR problem, and Optimization Design is very effective for determining the optimums of both the DR and its corresponding DSs, as explained above; but the other DRs are not optimized. For the latter, although the composite objective function is one DR, the optimization results are for the artificial DR, not for the real DRs, and are thus rather unconvincing, because different designers may give different weights to the same objective function according to their knowledge base [14]. Therefore, a coupled design is a bad design, and it is significant to apply Axiomatic Design to make the DRs satisfy the Independence Axiom, i.e., a perturbation in a particular DS must affect only its referent DR without affecting other DRs. When they are not coupled with each other, Optimization Design can then be applied very effectively to determine
the optimized DSs for each real DR of the design problem, because different real DRs are now controlled by different DSs and can reach their optimums at the same time. For a component made of a homogeneous material or a single heterogeneous material, as mentioned above, the design method is always to first choose a kind of material, then design the component configuration, and check whether the component can satisfy the functional requirements. If the initial material is found not to be suitable after checking, another material can be selected, according to the most important portion of the component, without changing its configuration, since it is permissible not to make full use of the material in the non-important portions. Therefore, its geometric design is not coupled with its material design. But components made of multi heterogeneous materials are used in high-tech applications and have many rigorous requirements or constraints. As mentioned above, their design processes have to be reversed, i.e., from functional requirements in the high-tech application to component configuration, to material properties, and to material microstructures and/or constituent compositions. Some functional requirements of a component made of multi heterogeneous materials can be satisfied by changing either the component's configuration or the material properties in different portions of the component, because these functions will change if we change either the component's configuration or the materials of the component. Thus, the geometric design and the material design are coupled with each other. Different configuration or geometric designs will require different material selections, and different material selections for each portion will also influence the shape and dimensions of the component and the material selection for other portions. Therefore, many factors are coupled with each other, and the design becomes very complicated. It is necessary to apply Axiomatic Design to design the design procedure so as to decouple them; otherwise, it is very difficult to obtain a good design or optimums for all the DRs, as described above. The elements and workflow of the design method developed, under the guidance of Axiomatic Design, for the components made of multi heterogeneous materials are shown in Figure 5 and explained as follows. First, the performance requirements (i.e., CAs) of the component to be designed are carefully analyzed and divided into two groups. The first group (CA_1) should be satisfied by the component's configuration (FR_1), and the second group (CA_2) should be met by the properties of the materials in different portions of the component (FR_2). The former is geometric design, and the latter is material design. The design equation, according to Axiomatic Design, can be written as:

\[
\begin{Bmatrix} CA_1 \\ CA_2 \end{Bmatrix} =
\begin{bmatrix} X & 0 \\ X & X \end{bmatrix}
\begin{Bmatrix} FR_1 \\ FR_2 \end{Bmatrix} \tag{1}
\]

where X represents a non-zero element and 0 represents a zero element.

Figure 5. Elements and workflow of the design method (classification of the performance requirements into two groups for geometric and material design, CAD modeling, and a check of whether the performance requirements are satisfied).
From the equation obtained, it can be seen that the design matrix at this level is a triangular matrix, which indicates that the design is a decoupled design [11, 12] and the independence
of the CAs can be assured if the FRs are adjusted in a particular order: FR_1 first and then FR_2. Therefore, it is reasonable and necessary that geometric design should be done first, followed by determining the material properties. That is, according to CA_1, a 3D variational geometric model [15, 16] of the component (FR_1) can first be built by using conventional design methods with the aid of an advanced computer-aided design system. Based on the model, the material properties in different portions of the component (FR_2) will then be determined in a certain way. When mapping from FRs to DPs, if configuration optimization is implemented first to obtain optimized geometric parameters, the design equation will be as follows:
\[
\begin{Bmatrix} FR_1 \\ FR_2 \end{Bmatrix} =
\begin{bmatrix} X & X \\ X & X \end{bmatrix}
\begin{Bmatrix} DP_1 \\ DP_2 \end{Bmatrix} \tag{2}
\]
since different material selection schemes will result in different optimal geometric parameters of the component, as analyzed above. It can be seen from Equation (2) that the design matrix is neither a diagonal nor a triangular matrix, which indicates that the design is a coupled design and does not satisfy the Independence Axiom [11, 12]. This procedure cannot be accepted and should be reversed. That is, the material selection is implemented first to obtain the optimized material microstructures and/or constituent
compositions (DP_2), followed by geometric optimization design to obtain the optimal geometric parameters (DP_1). The design equation then becomes:
\[
\begin{Bmatrix} FR_2 \\ FR_1 \end{Bmatrix} =
\begin{bmatrix} X & 0 \\ X & X \end{bmatrix}
\begin{Bmatrix} DP_2 \\ DP_1 \end{Bmatrix} \tag{3}
\]
It can be seen from the equation obtained that the design matrix at this level is also a triangular matrix, and the Independence Axiom can be met if the DPs are adjusted in a particular order: DP_2 first and then DP_1. Therefore, it is reasonable and necessary that material selection should be done first, followed by geometric parameter optimization. It is normal to have many suitable design schemes satisfying FR_2. For instance, some required material properties can be satisfied by many different materials, such as composites, functionally graded materials, or heterogeneous materials with a periodic microstructure. Even for the same type of material selected, say functionally graded materials, there are several different material constituent compositions that can satisfy the requirement concerned. Accordingly, material design optimization is first implemented to determine the optimal material constituent compositions and material microstructures for the different portions of the component. Based on the material design, the geometric parameters can be optimized further. After that, a CAD model with the information of both configuration and materials can be created for the component, and the finite element analysis method can be applied to analyze whether all the performance requirements are satisfied. If the performance requirements are satisfied, the design is over; if not, the CAs need to be analyzed again, and the above procedure is repeated until all the requirements are satisfied. The final CAD model can be used for manufacturing the multi heterogeneous component, e.g., using layered manufacturing methods. Thus, it can be summarized that the design procedure for the components made of multi heterogeneous materials should go through: (1) geometric design (CA_1 → FR_1), (2) material design (CA_2 → FR_2 → DP_2), and (3) geometric parameter optimization (FR_1 → DP_1). Since the first and the third steps are conventional geometric design, which is well developed, there is no need to introduce them further in this chapter. But material design is new and must be developed in detail.
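As an aside, the triangularity test behind Eqs. (1)-(3) is easy to mechanize. The following minimal Python sketch (with hypothetical helper names) classifies a boolean design matrix as uncoupled, decoupled, or coupled:

# A minimal sketch of checking the Independence Axiom on a boolean design
# matrix: uncoupled = diagonal, decoupled = triangular under some ordering of
# the DRs/DSs, coupled = everything else. The helper names are hypothetical.
from itertools import permutations

def classify_design(matrix):
    """matrix[i][j] is True when design solution j affects design requirement i."""
    n = len(matrix)
    if all(matrix[i][j] == (i == j) for i in range(n) for j in range(n)):
        return "uncoupled"
    for order in permutations(range(n)):      # try every adjustment order
        reordered = [[matrix[i][j] for j in order] for i in order]
        if all(not reordered[i][j] for i in range(n) for j in range(i + 1, n)):
            return "decoupled"                # lower-triangular: adjust in this order
    return "coupled"

# Eq. (1): geometric design (FR_1) first, then material design (FR_2).
print(classify_design([[True, False], [True, True]]))   # decoupled
# Eq. (2): configuration optimization first couples the two designs.
print(classify_design([[True, True], [True, True]]))    # coupled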
2.2. Material design

The elements and workflow of the material design method (CA_2 → FR_2 → DP_2) developed for the components made of multi heterogeneous materials are shown in Figure 6. The design procedure is explained as follows:

Step 1: Create a 3D variational geometric model for the component
The 3D variational geometric model [15, 16] of the component should first be built with the aid of current advanced CAD/CAE software. The reason for using a variational model is that the modification of its geometric parameters or model can be simplified after material design.
Figure 6. Elements and workflow of the material design method (create the 3D variational geometric model with the CAD/CAE system; optimize the material properties using sensitivity analysis; select the material constituent composition and/or microstructure of each region from the database of heterogeneous materials; create the region sets; create the CAD model for the component).
After material design, only a few variables need to be optimized or modified, and the other parameters will be modified automatically by the computer according to the relationships between these parameters and the variables. This model should meet both the first type of performance requirements (CA_1) and the constraints of the component (e.g., its overall dimensions and its relationship dimensions with other components in the assembly), and should have suitable variables for the geometric parameters as well as functional relationships between the variables and the other dimensions of the component. This work belongs to conventional geometric design and is used as the input of material design.
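A toy illustration of such functional relationships might look as follows; the part and the specific dimension ratios are assumptions made up for the example:

# A minimal sketch of the variational idea: a master variable drives the
# remaining dimensions through functional relationships, so changing one
# variable updates all dependent geometry. The rules below are illustrative.

def flange_dimensions(outer_radius: float):
    """Derive dependent dimensions from one master variable (assumed rules)."""
    return {
        "outer_radius": outer_radius,
        "inner_radius": 0.4 * outer_radius,   # assumed relationship
        "thickness": 0.1 * outer_radius,      # assumed relationship
        "bolt_circle": 0.8 * outer_radius,    # assumed relationship
    }

print(flange_dimensions(100.0))
print(flange_dimensions(120.0))   # one change propagates to all dimensions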
Step 2: Create an optimization model for determining the material properties in each region of the component
Based on the 3D variational geometric model, the component is divided into several portions or regions for selecting the optimal material constituent composition and/or microstructure. The partitioning can be implemented using any commercial Finite Element Analysis (FEA) software [17], since current FEA software all has this function. The number of regions depends on the component to be designed, since different components have different shapes and requirements; the guideline, however, is to keep the number of regions small, since there are normally not many different types of material constituent compositions and/or microstructures in a component. The approach is designed to (a) select one kind of material for the whole component first, (b) analyze the component's performance using the finite element analysis method, and (c) find the regions having a larger response to the component's performance. Based on the result, the geometric model of the component is modified by (i) increasing the size of the regions having larger responses, where
more material regions will be created by means of the preprocessor program [17] of the finite element analysis in the CAE software, and (ii) reducing the size of the regions having a smaller response, where fewer material regions will be created. Relationships are then established between their partial nodal coordinates and the variables of the component's geometric parameters, so that the geometric parameters can be optimized after material design. Effective material properties include mechanical properties (e.g., hardness, bulk and shear modulus), thermal properties (e.g., coefficient of thermal expansion and coefficient of thermal conductivity), electrical properties (e.g., electric conductivity and dielectric constant), optical properties, and chemical properties (e.g., diffusion constant). The performance requirements of the component can involve one or several of them (e.g., both a small coefficient of thermal expansion and high mechanical strength), and they should be optimized by the material design for each region. The objective function of the optimization model is the functional relationship between the component's performance and the material properties of the regions created, and can be in the form of an analytical expression or an implicit expression determined by a finite element model. The constraints of the optimization model normally involve manufacturability, the affinity of the two materials in adjacent regions, and/or physical properties of the component.

Step 3: Optimize material properties using sensitivity analysis and the steepest descent method
The sensitivity analysis [18] of material properties determines the changing rates of the response quantities (i.e., the component's performances) due to variations in the design variables (i.e., the material properties in different regions). If the objective function is in the form of mathematical equations, the sensitivity analysis evaluates the partial derivatives of the component's performances with respect to the material properties in each region. If the objective function is an implicit expression determined by a finite element model, the sensitivity analysis creates a global stiffness matrix for each performance and determines the changing rates of the performance due to variations in the material properties of each region. Based on the sensitivity analysis, the optimal values of the material properties are obtained for each region using the Steepest Descent Method [13].

Step 4: Select material constituent compositions and/or microstructures
According to the results of the optimization, suitable materials can be selected for each region from the database of heterogeneous materials. Normally, there are many suitable materials. Thus, Genetic Algorithms [19, 20] are applied to find the optimal scheme that satisfies the constraints and gives the best performance.

Step 5: Create the region sets for material constituent compositions and material microstructures respectively
The regions with the same or similar material constituent compositions and material microstructures can be combined into a larger region. Thus, the component
consists of several larger regions for different material constituent compositions, which form one region set, and several larger regions for different material microstructures, which form another region set.

Step 6: Create the CAD model for the component made of multi heterogeneous materials
After the above steps, all the information needed for creating the CAD model has been obtained. The information includes the two region sets, the code name of the material constituent composition for each region, and the code name of the material microstructure for each region. According to these code names, further information, such as the material constituent composition function, the geometric model of the inclusion, and the inserting function, can be obtained from the database of heterogeneous materials. The method for CAD modeling will be introduced in Section 3. After the CAD model is created, the optimal geometric parameters of the component can be obtained using the optimization design method [13], which belongs to conventional geometric design and is not illustrated in this chapter. The following sections introduce the elements of the material design method in more detail.

2.3. How to determine the optimal material properties needed in different regions
As mentioned in the previous section, the optimal properties of the materials needed in different portions of a component can be determined using sensitivity analysis and the steepest descent method. Sensitivity analysis is the method for obtaining the changing rates of response quantities due to variations in design variables. In this case, the design variables are the material properties of the different portions in a component, and the response quantity is the objective performance of the component. For instance, the simply supported component made of heterogeneous materials shown in Figure 7 is subject to a vertical load (F_0). Its design variables are the relevant material properties of the different regions in the component, and its response quantity is its response displacement (U_0).

Figure 7. A simply supported component made of heterogeneous materials.

The procedure for determining the optimal properties of the materials needed in different portions of a component based on sensitivity analysis and the steepest descent method can be explained as follows:
Step 1: Create optimization model
The optimization model for selecting the materials in different regions can be written as follows:

\[
\begin{aligned}
\text{Minimize } & U_0 = f(B) \\
\text{subject to } & g_u(B) \le 0, \quad u = 1, 2, \ldots, m \\
& h_v(B) = 0, \quad v = 1, 2, \ldots, p
\end{aligned} \tag{4}
\]

where U_0 = (u_1, u_2, \ldots, u_k)^T is the objective performance vector of the component; k is the number of objective performances; B = (b_1^{(1)}, b_2^{(1)}, \ldots, b_{c_1}^{(1)}, b_1^{(2)}, b_2^{(2)}, \ldots, b_{c_2}^{(2)}, \ldots, b_1^{(n)}, b_2^{(n)}, \ldots, b_{c_n}^{(n)}); and b_j^{(i)} (i = 1, 2, \ldots, n; j = 1, 2, \ldots, c_i) is the j-th material property in the i-th region.

Step 2: Determine the material sensitivity for each region
If the objective function is in the form of mathematical equations, the sensitivity (s_j^{(i)}) of the j-th material property of the i-th region with respect to the objective performance of the component can be obtained by:

\[
s_j^{(i)} = \frac{\partial U_0}{\partial b_j^{(i)}} \tag{5}
\]
If the objective function is an implicit expression determined by a finite element model, the procedure for obtaining the sensitivity is as follows:

(a) Derive the global stiffness matrix [K] for the component

To simplify the illustration, each material region is taken as one finite element. The component is first divided into an equivalent system of finite elements with associated nodes and the most appropriate element type. According to each type of response quantity (e.g., displacement) of the component, the stiffness matrix of each finite element can be obtained and assembled into the global stiffness matrix of the component. Thus, the equilibrium equation can be given as follows:

\[
[K]\, U = F \tag{6}
\]
where [K] is the global stiffness matrix of the component, U is the displacement vector of the nodal points of the component, and F is the external load vector of the component.

(b) Derive the sensitivity of the j-th property in the material property vector of the i-th region with respect to the objective performance of the component
The global stiffness matrix of the component can be considered a function of the j-th property in the material property vector of the i-th region, and the equilibrium equation can be rewritten as:

\[
\left[K\!\left(b_j^{(i)}\right)\right] U = F \tag{7}
\]
Implicit differentiation of Eq. (7) with respect to the variable b_j^{(i)} yields

\[
[K]\, \frac{\partial U}{\partial b_j^{(i)}} = \frac{\partial F}{\partial b_j^{(i)}} - \frac{\partial [K]}{\partial b_j^{(i)}}\, U \tag{8}
\]
Since the load does not vary with the material property, the first item on the right side of Eq. (8) is equal to zero. Thus, the equation can be rewritten as:

\[
[K]\, \frac{\partial U}{\partial b_j^{(i)}} = -\frac{\partial [K]}{\partial b_j^{(i)}}\, U \tag{9}
\]
(c) Partition the global stiffness matrix

The displacement vector can be partitioned as:

\[
U^* = \begin{Bmatrix} U_0 \\ U_i \\ U_q \end{Bmatrix} \tag{10}
\]
where U_0 is the sub-vector of displacements for the objective performances, U_i is the sub-vector of nodal displacements of the i-th region, and U_q is the sub-vector consisting of the other displacements. Thus, the global stiffness matrix and the external load vector can be partitioned according to the sequence of U^* as:
\[
[K^*] = \begin{bmatrix}
[K_{00}] & [K_{0i}] & [K_{0q}] \\
[K_{i0}] & [K_{ii}] & [K_{iq}] \\
[K_{q0}] & [K_{qi}] & [K_{qq}]
\end{bmatrix} \tag{11}
\]

and

\[
F^* = \begin{Bmatrix} F_0 \\ F_i \\ F_q \end{Bmatrix} \tag{12}
\]
(d) Approximate the sensitivity by finite differences

The most challenging task here is to evaluate the derivatives of the stiffness matrix with respect to the design parameters. Normally, this is done by approximating
them by finite differences [21]. Thus, Eq. (9) can be mathematically stated as:

\[
[K]\, \frac{\Delta U}{\Delta b_j^{(i)}} = -\frac{\Delta [K]}{\Delta b_j^{(i)}}\, U \tag{13}
\]

where \Delta b_j^{(i)} is a small perturbation of the j-th material property of the i-th region. Therefore, the sensitivity (s_j^{(i)}) of the j-th material property of the i-th region with respect to the objective performance of the component can be obtained from Eq. (13) as:

\[
s_j^{(i)} = \frac{\Delta U_0}{\Delta b_j^{(i)}} \tag{14}
\]

All the sensitivities can then be assembled into a sensitivity vector S as:

\[
S = \left( s_1^{(1)}, s_2^{(1)}, \ldots, s_{c_1}^{(1)}, \ldots, s_1^{(n)}, s_2^{(n)}, \ldots, s_{c_n}^{(n)} \right)^{\mathrm T} \tag{15}
\]
Step 3: Search for the optimal material property vector of different regions of the component
Optimization can be implemented according to the Steepest Descent Method [13]:

(a) Start with an initial point B_1. Set the iteration number as k = 1.
(b) Find the search direction -S_k using the sensitivity analysis introduced above.
(c) Determine the optimal step length h_k in the direction -S_k and set

\[
B_{k+1} = B_k - h_k S_k \tag{16}
\]
(d) Test the new point, B_{k+1}, for optimality. The new objective performance of the component can be estimated by:

\[
U_0(B_{k+1}) \approx U_0(B_k) + S_k^{\mathrm T} \left( B_{k+1} - B_k \right) \tag{17}
\]
or U_0 can be obtained from Eq. (10), where

\[
U^* = [K^*]^{-1} F^* \tag{18}
\]
After the new values of the material properties (B_{k+1}) are obtained, check both whether there are suitable material microstructures and/or constituent compositions for them in the database of heterogeneous materials and whether the response quantities (i.e., the component's performances) are improved. If the answers to both checks are "yes", set the new iteration number k = k + 1 and go to step (b). The above procedure from step (b) is repeated until the improvement of the objective performance of the component is smaller than a threshold. If the answer to either of the two checks is "no", the optimization process is over. The last material property vector of the different regions in the component is the optimal material property vector of the different regions of the component.
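A compact numerical sketch of this loop, under simplifying assumptions (a closed-form performance function in place of the finite element model, and a simple bound check standing in for the database lookup), could read:

# A minimal numerical sketch of Steps 1-3: the component performance f(B) is
# assumed given in closed form, sensitivities come from forward finite
# differences (Eqs. 13-14), and the database check is reduced to property
# bounds. All names and the example f are hypothetical.
import numpy as np

def sensitivities(f, B, db=1e-6):
    """Finite-difference approximation of S = dU0/dB (Eqs. 13-15)."""
    S = np.zeros_like(B)
    f0 = f(B)
    for j in range(B.size):
        Bp = B.copy()
        Bp[j] += db
        S[j] = (f(Bp) - f0) / db
    return S

def optimize_properties(f, B, in_database, h=0.1, tol=1e-8, max_iter=200):
    """Steepest descent (Eq. 16) with the two checks described in the text."""
    for _ in range(max_iter):
        S = sensitivities(f, B)
        B_new = B - h * S                       # Eq. (16)
        if not in_database(B_new):              # no suitable microstructure/composition
            break
        if f(B) - f(B_new) < tol:               # improvement below threshold
            break
        B = B_new
    return B

# Example: two regions, one property each; minimize a quadratic "performance".
f = lambda B: (B[0] - 2.0) ** 2 + (B[1] - 3.0) ** 2
in_database = lambda B: np.all((B > 0.0) & (B < 10.0))   # assumed property range
print(optimize_properties(f, np.array([8.0, 9.0]), in_database))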
2.4. How to select material constituent composition and microstructure

After the material properties of each region in the component are determined as introduced above, the suitable material constituent composition and microstructure can be selected for each region from the database of heterogeneous materials, which is designed using the IDEF1X notation [22] as shown in Figure 8. Commercial tools are readily available for constructing IDEF1X diagrams and generating the database structure. According to the IDEF1X notation in Figure 8, a square box indicates an independent entity, which can exist on its own, as its name suggests; a rounded box represents a dependent entity, which can exist only if some other entities also exist; and the name of the entity is listed above the box. The top portion of the box contains the primary key attributes, the lower portion contains the remaining attributes, and the notation "FK" denotes a foreign key. The lines are labeled with relationship type names. A solid line between two entities denotes an identifying relationship type, and a dashed line represents a non-identifying relationship type. A solid ball denotes a multiplicity of many (zero or more), the lack of a symbol indicates a multiplicity of exactly one, and a line with a solid ball at one end represents a one-to-many relationship. The large circle with two lines underneath denotes generalization. The attribute next to the circle is called a discriminator and indicates whether each heterogeneous material record is elaborated by material constituent composition or by material microstructure.

2.4.1. Select material constituent compositions from the database of material constituent composition
There are two types of operations: from code name (C.N.) to properties, and from properties to code name (C.N.).

(a) Based on the code name of the material constituent composition, retrieve its material constituents, the code name of the constituent composition function, the properties of the material constituents, the manufacturing technology and equipment, the application fields, and application examples.

(b) Based on the optimal material properties determined using the above method, search for the code name of a material constituent composition whose properties are close to, or a bit higher than, the optimal properties determined.
Figure 8. Database of heterogeneous materials designed using IDEF1X (entities include the heterogeneous materials, the constituent compositions with their composition functions, the material microstructures with their effective properties, the inserting functions of inclusions, and a variational graphics library).
Then, based on the code name of the material constituent composition found, retrieve its material constituents, the code name of the constituent composition function, the properties of the material constituents, the manufacturing technology and equipment, the application fields, and application examples. Since there is normally more than one material constituent composition that satisfies the requirement derived from the optimal material properties, all of them should be found.

2.4.2. Select material microstructures from the database of material microstructure
There are also two types of operations: from code name (C.N.) to properties, and from properties to code name (C.N.).

(a) Based on the code name of the material microstructure, retrieve the code name of the variational geometric model of the microstructure, the code name of its inserting function, the type of heterogeneous material (composite, or heterogeneous material with a periodic microstructure), the effective properties of the material, the manufacturing technology and equipment, the application fields, and application examples.

(b) Based on the optimal material properties determined using the above method, search for the code name of a material microstructure whose properties are close to, or a bit higher than, the optimal properties. Then, based on the code name of the material microstructure, retrieve the code name of the variational geometric model of the microstructure, the code name of its inserting function, the type of heterogeneous material (composite, or heterogeneous material with a periodic microstructure), the effective properties of the material, the manufacturing technology and equipment, the application fields, and application examples. Since there is normally more than one material microstructure that satisfies the requirement derived from the optimal material properties, all of them should be found.
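A hedged sketch of the "from properties to code name" search, using an in-memory record list in place of the real database (field names, tolerances, and data are illustrative assumptions):

# Scan a record list standing in for the heterogeneous-materials database and
# keep every entry whose properties are close to, or a bit higher than, the
# optimal values determined by the optimization.

RECORDS = [  # (code name, {property: value}); made-up example data
    ("CC-01", {"modulus": 210.0, "conductivity": 45.0}),
    ("CC-02", {"modulus": 198.0, "conductivity": 52.0}),
    ("CC-03", {"modulus": 120.0, "conductivity": 380.0}),
]

def search_code_names(targets, records=RECORDS, slack=0.15):
    """Return all code names whose every property lies in [target, target*(1+slack)]."""
    hits = []
    for code, props in records:
        if all(targets[k] <= props.get(k, float("-inf")) <= targets[k] * (1 + slack)
               for k in targets):
            hits.append(code)
    return hits   # normally more than one candidate; all are kept

print(search_code_names({"modulus": 190.0, "conductivity": 40.0}))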
2.5. How to generate two material region sets

After the selection from the databases, there are many suitable material constituent compositions and material microstructures for each region of the component. Among them, the most suitable one should be selected, with good material affinities for adjacent regions, the lowest material cost, and the lowest manufacturing cost. As far as the material constituent composition is concerned, the regions with similar material constituent compositions can be aggregated into a larger region, and thus the component can be divided into several regions, which form a set of material constituent composition regions (C_1). For the material microstructure, the regions with similar material microstructures can also be combined into a larger region, and thus the component can also be divided into several regions, which form another set of material microstructure regions (C_2). This work is implemented using Genetic Algorithms [19, 20] and is explained as follows:
Figure 9. Mapping between coding space and solution space (genetic operations act in the coding space, evaluation and selection in the solution space, linked by encoding and decoding).

(1) Encode decision variables
If there are n regions in a component and m_i material choices for the i-th region, the decision variables or solutions (i.e., the material constituent compositions and material microstructures of the different regions of a component) are encoded as a string, called a chromosome, which has n bits, the i-th bit taking a decimal value in the range (1~m_i). It can be represented as:

x_1 | x_2 | ... | x_i | ... | x_n,  with x_i ∈ {1, 2, ..., m_i}
(2) Determine the size of population
The coding space of the chromosomes covers the total population (TP), which is very large and can be calculated by TP = \prod_{i=1}^{n} m_i. The initial population should contain fewer chromosomes, which are randomly generated. The number (L) of chromosomes in the initial population is determined, according to the TP, by:
\[
L = \begin{cases}
TP, & \text{if } TP < 20 \\
0.2 \times TP, & \text{if } TP \ge 20 \text{ and } L > 20 \\
20, & \text{if } TP \ge 20 \text{ and } L \le 20
\end{cases} \tag{19}
\]
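Steps (1) and (2) can be sketched in a few lines of Python; the region and choice counts below are made-up example data:

# A small sketch of the encoding and population sizing: chromosomes are
# integer strings with the i-th gene drawn from 1..m_i, and the initial
# population size L follows Eq. (19).
import math
import random

def population_size(m):
    """Eq. (19): L from the total population TP = product of the m_i."""
    tp = math.prod(m)
    if tp < 20:
        return tp
    return int(0.2 * tp) if 0.2 * tp > 20 else 20

def random_chromosome(m):
    """One chromosome: gene i is a material choice in 1..m_i."""
    return [random.randint(1, mi) for mi in m]

m = [3, 4, 2, 5]                # 4 regions with 3, 4, 2, 5 material choices
L = population_size(m)          # TP = 120, so L = 0.2 * 120 = 24
population = [random_chromosome(m) for _ in range(L)]
print(L, population[0])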
(3) Evaluation
The decision variables represented by a chromosome can be illegal, infeasible, or feasible, as shown in Figure 9. All the genetic operations are implemented in the coding space, while evaluation and selection take place in the solution space. The mapping between the two spaces is through encoding and decoding. If the solution or decision variables lie outside the solution space, the chromosome is an illegal one, which cannot be
Table 1. Database for the affinity of materials (columns: Material A code no. and price, Material B code no. and price, affinity of materials k_j^{(i)}).
evaluated and has to be eliminated. The chromosomes in the coding space defined in this chapter can always be mapped to the solution space, so no illegal one will be generated. An infeasible one is a solution that cannot satisfy the constraints from manufacturing technology, materials, etc., and lies outside the feasible area. A feasible one is a solution that meets the constraints but need not be an optimal solution. The objective function for deriving the optimal solution is represented by the fitness. The fitness value for the i-th chromosome is calculated by:

\[
F_i = k_1 \sum_{j=1}^{m_i} k_j^{(i)} - a \sum_{j=1}^{n} \left( C_j + M_j \right) V_j + P_i \tag{20}
\]
In Eq. (20), the first item evaluates the material affinity of adjacent regions in the i-th chromosome, where m_i is the number of boundaries between adjacent regions in the i-th chromosome and k_j^{(i)} is the material affinity value of the j-th boundary in the i-th chromosome. If the material affinity of two adjacent regions is poor, there will be stress concentration at the interface between the two regions. The material affinity can take a value within (0, 1) and can be looked up in its database, whose structure is represented by Table 1. If the material of a region is the same as that of one of its adjacent regions, the material affinity for the adjacent regions is 1. If the material affinity cannot be found in the database, the materials of the two adjacent regions cannot be joined, and the value of the material affinity is set to -100 as a penalty for not satisfying the material constraint. The second item in Eq. (20) evaluates the material cost and manufacturing cost, where C_j is the price per unit volume of the material in the j-th region, which can be looked up in the database represented by Table 1; V_j is the volume of the j-th region; M_j is the manufacturing cost per unit volume for the j-th region, which can be looked up in the database about manufacturability, whose structure is represented by Table 2; and n is the number of regions in the component. The coefficients k_1 and a are used to adjust the weights of the items. The database of manufacturability includes the types and models of layered manufacturing machines, the types and number of materials to be added, the minimal possible size of inclusions, and the material manufacturing cost per unit volume. The third item in Eq. (20) is the penalty for not satisfying the manufacturability constraints, where P_i is the penalty value of the i-th chromosome.
Table 2. Database for manufacturability (columns: type of layered manufacturing machine, model of layered manufacturing machine, type of the materials to be added, number of the materials to be added, minimal possible size of inclusions, material manufacturing cost per unit volume).
The material constituents selected should be able to be manufactured on corresponding layered manufacturing machines. When a suitable machine is found in the database represented by Table 2, P_i is equal to zero. If no suitable machine is found, the penalty (P_i) of the i-th chromosome will be -100.
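Putting the three items of Eq. (20) together, a minimal sketch of the fitness evaluation might look as follows, with small stand-ins for Tables 1 and 2; the weights k_1 and a, the data, and the exact arrangement of the items follow the reconstruction above and are assumptions:

AFFINITY = {("steel", "steel"): 1.0, ("bronze", "steel"): 0.7}   # Table 1 stand-in
MACHINES = {"steel", "bronze"}                                   # Table 2 stand-in

def fitness(materials, boundaries, volume, price, mfg_cost, k1=1.0, a=0.01):
    """materials: material per region; boundaries: pairs of adjacent regions."""
    affinity = 0.0
    for r1, r2 in boundaries:
        pair = tuple(sorted((materials[r1], materials[r2])))
        affinity += AFFINITY.get(pair, -100.0)     # -100: materials cannot be joined
    cost = sum((price[m] + mfg_cost[m]) * v for m, v in zip(materials, volume))
    penalty = 0.0 if all(m in MACHINES for m in materials) else -100.0
    return k1 * affinity - a * cost + penalty

materials = ["steel", "steel", "bronze"]
boundaries = [(0, 1), (1, 2)]
volume = [2.0, 1.0, 1.5]
price = {"steel": 5.0, "bronze": 9.0}
mfg_cost = {"steel": 1.0, "bronze": 2.0}
print(fitness(materials, boundaries, volume, price, mfg_cost))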
(4) Selection

The first generation of the population is selected randomly from the initial population using a roulette wheel approach [19].

(5) Crossover operation
After the first generation is obtained, the genetic operations involve crossover and mutation to yield offspring. Crossover operates on two chromosomes at a time and generates offspring by combining features of both chromosomes. But it should avoid inbreeding, since crossover between two similar chromosomes is not useful for efficient evolution. The degree of homology between two chromosomes may be estimated by calculating the evolutional distance between the genes of the two chromosomes. Since this evolutional distance is not easy to calculate, an alternative method has been developed in this chapter: the degree of homology is determined based on a "family tree". For example, the first generation has twelve chromosomes, 1.1 to 1.12, as shown in Figure 10. The second generation after genetic operation and selection has twelve chromosomes: 1.1~1.5, 1.7~1.9, 1.11, and 2.1~2.3. The chromosomes 2.1~2.3 are three offspring that were generated, and 1.6, 1.10, and 1.12 have been eliminated. Chromosomes 1.1 and 2.1 form a family; chromosomes 1.2, 2.2, and 2.3 form another family. The third generation after another genetic operation and selection has twelve chromosomes: 1.1~1.3, 1.5, 1.8, 1.9, 1.11, 2.1~2.3, 3.2, and 3.3. The chromosomes 3.1~3.3 are three offspring that were generated, and 1.4, 1.7, and 3.1 have been eliminated. Chromosomes 1.2, 2.2, and 2.3 form one family, and chromosomes 1.1, 1.9, 2.1, 3.2, and 3.3 form another family. The chromosomes marked with a cross in Figure 10 have been eliminated through selection. Chromosome 1.11, for instance, has no consanguinity relationship with chromosome 3.2, which is in another family, so they should be given priority for crossover. After evolution over many generations, all the chromosomes in the population may have consanguinity relationships with each other. The degree of homology between two chromosomes is measured by the number of evolutional paths between them. For example, the number of evolutional paths between chromosomes 1.2 and 2.2 is 1, and that between chromosomes 2.1 and 3.3 is 3. Sometimes there are several different routes between two chromosomes.
Figure 10. Family tree (three generations of chromosomes; the chromosomes marked with a cross have been eliminated through selection).
In this case, the route with the smaller number of evolutional paths is selected for measuring their degree of homology. If there are n chromosomes in a population, the number of possible pairs for crossover can be calculated by:
\[
s = C_n^2 = \frac{n!}{2\,(n-2)!} \tag{21}
\]
The probability of crossover for the i-th pair of chromosomes is determined by:

\[
p_c^{(i)} = \frac{G_i}{k + G_i} \tag{22}
\]
where G_i is the degree of homology between the chromosomes in the i-th pair, and k is a positive real number used to adjust the sensitivity of G_i with respect to the probability of crossover. When there is no consanguinity relationship between two chromosomes, G_i is taken to be infinity, and thus p_c^{(i)} is equal to 1. After the probabilities of crossover for all the pairs of chromosomes are determined, each probability can be normalized by:

\[
p_c^{(i)\prime} = \frac{p_c^{(i)}}{\sum_{l=1}^{s} p_c^{(l)}} \tag{23}
\]
The cumulative probability for the i-th pair of chromosomes can then be obtained by:

\[
q_c^{(i)} = \sum_{l=1}^{i} p_c^{(l)\prime} \tag{24}
\]
Based on their cumulative probabilities, a roulette wheel can be constructed. The selection process begins by spinning the roulette wheel 0.2L times; each time, a pair of chromosomes is selected for the new population in the following way:

Step 1. Generate a random number r from the range [0, 1].
Step 2. If r ≤ q_c^{(1)}, select the first pair of chromosomes; otherwise, select the i-th pair of chromosomes such that q_c^{(i-1)} < r ≤ q_c^{(i)}.
For each pair of chromosomes selected, for instance:

Parent 1: (x_1, ..., x_j | x_{j+1}, ..., x_k | x_{k+1}, ..., x_n)
Parent 2: (y_1, ..., y_j | y_{j+1}, ..., y_k | y_{k+1}, ..., y_n)

generate two random numbers, j and k (j < k), from the range (1~n), and exchange the segments bounded by the crossover points represented by the two random numbers to create two new chromosomes, called offspring:

Offspring 1: (x_1, ..., x_j | y_{j+1}, ..., y_k | x_{k+1}, ..., x_n)
Offspring 2: (y_1, ..., y_j | x_{j+1}, ..., x_k | y_{k+1}, ..., y_n)
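A minimal sketch of this two-point crossover, together with the pair probability of Eq. (22) as reconstructed above (the homology value G is a placeholder for the family-tree path count):

import random

def two_point_crossover(p1, p2):
    """Exchange the segment between two random cut points j < k."""
    n = len(p1)
    j, k = sorted(random.sample(range(1, n), 2))
    c1 = p1[:j] + p2[j:k] + p1[k:]
    c2 = p2[:j] + p1[j:k] + p2[k:]
    return c1, c2

def crossover_probability(G, k=2.0):
    """Eq. (22): distant pairs (large G, up to infinity) approach probability 1."""
    return 1.0 if G == float("inf") else G / (k + G)

parent1 = [1, 2, 3, 4, 5, 6]
parent2 = [6, 5, 4, 3, 2, 1]
print(two_point_crossover(parent1, parent2))
print(crossover_probability(3.0), crossover_probability(float("inf")))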
(6) Mutation
New chromosomes or offspring can also be formed using a mutation operator, which involves modification of the values of genes in a chromosome and increases the variability of a population. When the fitness functions of the chromosomes in a population converge to a small range or a local optimum, it is difficult for crossover to generate offspring with improved fitness function values, but mutation can play an important role here. In fact, the convergence of the fitness functions of the chromosomes in a population to a small range means that more and more similar chromosomes have been obtained. The similarity of two chromosomes indicates that the two chromosomes have the same genes in some of the same bits. A population can be divided into several groups according to these similarities. The similarity of the i-th group can be represented by the product of the number (G_i) of bits with the same genes and the number (C_i) of chromosomes in the group. Some chromosomes may be similar to more than one group, but they can only belong to the group they are most similar to. The group with more similarity should have
priority for mutation. In order to find the global optimum rapidly, the probability of mutation can be adjusted using the following formula:

\[
p_m^{(i)} = \frac{C_i G_i}{k + C_i G_i} \tag{25}
\]
where k is a positive real number used to adjust the sensitivity of C_i G_i with respect to the probability of mutation. Using the same roulette wheel approach as that used for crossover, 10% of the chromosomes are selected for mutation. Since the genes that the chromosomes of a group share in the same bits are possibly the locally optimal ones, a gene (e.g., the k-th gene) is randomly selected from these shared genes, and its value is replaced with a value randomly selected from the range (1~m_k) of its possible material constituent compositions and material microstructures.
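A short sketch of this mutation step, with the group detection reduced to a direct gene comparison for illustration (the example data are made up):

import random

def mutate_shared_gene(chromosome, group, m):
    """Re-randomize one gene that the whole group has in common (possible local optimum)."""
    shared = [i for i in range(len(chromosome))
              if all(c[i] == chromosome[i] for c in group)]
    if not shared:
        return chromosome[:]
    k = random.choice(shared)                  # pick one shared bit
    child = chromosome[:]
    child[k] = random.randint(1, m[k])         # new material choice for that region
    return child

group = [[1, 2, 3, 4], [1, 2, 1, 4], [1, 2, 2, 4]]   # genes 0, 1, 3 are shared
print(mutate_shared_gene(group[0], group, m=[3, 4, 2, 5]))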
(7) Reproduction

Reproduction is performed on an enlarged sampling space [19]. After the crossover operations, offspring amounting to 40% of the population have been generated. With the mutation operations, offspring amounting to another 10% of the population have been obtained. Therefore, the parents and the offspring together amount to 150% of the population and form an enlarged sampling space. It is easy to implement evolution based on an enlarged sampling space [19]. After the genetic operations, check whether there are identical chromosomes among the parents and the offspring; if so, keep one of them and eliminate the others. Then an elitist selection scheme is used: the chromosome with the highest fitness function value among the parents and the offspring is selected as the elitist and copied directly into the new population of the next generation. With this operation, nature's survival-of-the-fittest mechanism is guaranteed. The other chromosomes are selected by a roulette wheel selection scheme, in which each chromosome among the parents and the offspring is represented by a slot with a size proportional to its fitness function value.

(8) Stop criterion
After the second generation is generated, the above crossover, mutation, and reproduction operations are repeated until the best chromosome is obtained. Since the genetic operations are implemented randomly, the fitness is not always increased continuously, so it is not correct to stop the optimization process as soon as there is no increment or improvement of the fitness after a new generation is obtained. Therefore, a threshold is used for the number of generations that have the same best chromosome. That is, if n_i - n^* is greater than a threshold q = 20, the iteration process can be stopped, where n_i is the current generation number and n^* denotes the generation number at which the best chromosome among all generations was first found. Then the best chromosome can be output as the optimal solution.

(9) Construct the regions for different material constituent compositions and material microstructures
With the optimal solution obtained, the types of material constituent compositions and material microstructures in all the regions of the component have been optimized.
Figure 11. A flywheel and its material regions (thickness W, region radii R_{i-1} and R_i, inner radius R_in, outer radius R_out).
Based on the solution, the adjacent regions with similar material constitue nt compositions can be aggregated into a larger region, and tho se with similar material microstru ctures can also be combined into a larger region . Thus, two region sets are formed for material constitue nt compositio ns and material microstru ctures respectively. 2.6. An example of multi heterogeneous component design
Since the design procedure is justified by Axiomatic Design and all the techniques used are mature and already justified, a simple design example is introduced in this section to illustrate how to apply the method. As the example, a flywheel is designed using the method introduced in this chapter. This flywheel is not an ordinary one: it has many rigorous requirements or constraints and is used in a high-tech device. The constraints include very light weight, very high moment of inertia, a disk-like shape, and other required working conditions (e.g., the available space). If a homogeneous material or a single heterogeneous material were used, the material would have to have a very large specific gravity to meet the very high moment of inertia, which cannot satisfy the requirement of very light weight. Therefore, multi heterogeneous materials are needed for its design. These requirements or constraints can be divided into two types, as introduced in Section 2.1. According to the first type of the component's performances (CA_1), the flywheel has been designed by geometric design as shown in Figure 11. Its shape is like a disk with a thickness W; its outer radius is R_out and its inner radius is R_in.

(1) Requirements for material design
Based on the geometric design, the second type of the component's performances (CA_2) should be further satisfied by material design according to Axiomatic Design (Eq. 1). Its CA_2 is to maximize the moment of inertia of the flywheel (I). But it has to meet the
constraints, including: (a) its mass must be smaller than a threshold, M_0; (b) the largest Von Mises stress in the flywheel must be smaller than a threshold, σ_0; and (c) other constraints from manufacturability, material affinity between two adjacent regions, etc.
(2) Generate material regions
Based on the geometric design, its 3D variational geometric model can be built with the aid of current advanced CAD/CAE software. Then the flywheel is divided into 20 (or even more) regions, as shown in Figure 11, each of which has a specified material constituent composition and a specified material microstructure. This does not mean that the final solution has 20 material regions with 20 different materials: after material design, adjacent regions with the same or similar materials will be merged into larger regions, and the number of sub-regions will become much smaller.
(3) Create optimization model
The optimization model for selecting the material in each region can be written as follows:

Maximize $I = \sum_{i=1}^{20} \frac{\pi W V_i}{2g}\left(R_i^4 - R_{i-1}^4\right)$  (26)

Subject to $M = \sum_{i=1}^{20} \frac{\pi W V_i}{g}\left(R_i^2 - R_{i-1}^2\right) \leq M_0$  (27)

$\sigma_{VM} \leq \sigma_0$  (28)
where W is the width of the flywheel, V_i is the specific gravity of the material in the i-th region, g is the acceleration of gravity, M is the mass of the flywheel, and σ_VM is the Von Mises stress in the flywheel.
(4) Sensitivity analysis of material properties
From Eq. (26), it can be seen that, among the material properties, the moment of inertia of the flywheel depends only on the specific gravities of the materials (V_i). The sensitivity analysis evaluates the partial derivative of the moment of inertia of the flywheel with respect to the specific gravity of the material in each region. For the i-th region, the sensitivity is obtained as follows:
$S_i = \frac{\partial I}{\partial V_i} = \frac{\pi W}{2g}\left(R_i^4 - R_{i-1}^4\right), \quad i = 1, 2, \ldots, 20$  (29)
(5) Search for the optimal material property vector of different regions of the flywheel
According to the steepest descent method introduced above, the optimization is implemented as follows:
(a) Start with an initial point $V_1 = (V_1^{(1)}, V_2^{(1)}, \ldots, V_{20}^{(1)})^T$. Set the iteration number as k = 1.
(b) Find the search direction $S_k$ using the sensitivity analysis introduced above.
(c) Determine the optimal step length $\lambda_k$ in the direction $S_k$ and set

$V_{k+1} = V_k + \lambda_k S_k$  (30)

(d) Test the new point $V_{k+1}$ for optimality. It is obvious from Eq. (29) that the sensitivity of an outside region is larger than that of an inside region. After the specific gravities of the materials in the regions are updated using Eq. (30), the new objective performance of the flywheel can be estimated by Eq. (26), the mass of the flywheel (M*) can be calculated by Eq. (27), and the largest Von Mises stress in each region can be obtained using finite element analysis. If M* > M_0, the specific gravity vector will be modified using:
$V_i^{(k+1)} \leftarrow V_i^{(k+1)} \cdot \frac{M_0}{M^*}, \quad i = 1, 2, \ldots, 20$  (31)
so that the total mass of the flywheel is kept at M_0. If σ_VM > σ_0, the optimization process is over. After the material specific gravity of each region in the component is determined as introduced above, a suitable material constituent composition and microstructure should be selectable for each region from the database of heterogeneous materials, and should meet the constraints specified in Eqs. (27) and (28), i.e., the total mass of the flywheel is smaller than or equal to the limit M_0 and the tensile strength of the material selected for the i-th region is larger than the largest Von Mises stress in that region. If the new scheme of material design is better than the original one, set the iteration number k = k + 1 and go to step (b). The procedure from step (b) is repeated until the change in the objective performance of the flywheel (Eq. 26) is smaller than a threshold. During the process, the specific gravities of the materials in the outside regions are increased more and more, and those in the inside regions decreased. The last material specific gravity vector of all the regions in the flywheel is the optimal solution.
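As a hedged illustration of steps (a) through (d), the following NumPy sketch implements the loop under the reconstructed Eqs. (26) to (31); the fixed step length and the stress-check callback (standing in for the finite element analysis the chapter prescribes) are our assumptions:

    import numpy as np

    def optimize_specific_gravities(R, W, g, M0, V0, step=1e-3, tol=1e-6,
                                    stress_ok=lambda V: True):
        # R: radii array [R_0, ..., R_20]; V0: initial specific gravities;
        # stress_ok: placeholder for the Von Mises check of Eq. (28).
        V = np.asarray(V0, dtype=float)
        inertia = lambda V: np.pi * W / (2 * g) * np.sum(V * (R[1:]**4 - R[:-1]**4))
        mass = lambda V: np.pi * W / g * np.sum(V * (R[1:]**2 - R[:-1]**2))
        I_old = inertia(V)
        while True:
            # Eq. (29): sensitivities are larger for the outer regions.
            S = np.pi * W / (2 * g) * (R[1:]**4 - R[:-1]**4)
            V_new = V + step * S                      # Eq. (30)
            if mass(V_new) > M0:
                V_new *= M0 / mass(V_new)             # Eq. (31): keep mass at M0
            if not stress_ok(V_new):                  # Eq. (28) violated: stop
                return V
            I_new = inertia(V_new)
            if I_new - I_old < tol:                   # improvement below threshold
                return V_new
            V, I_old = V_new, I_new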
(6) Select material constituent composition and microstructure for each region
According to the optimized specific gravity vector, there are many suitable material constituent compositions and material microstructures for each region of the component. Among them, the most suitable one is selected, with good material affinities for adjacent regions, the lowest material cost, and the lowest manufacturing cost, using the Genetic Algorithms introduced in Section 2.5.
3. CAD MODELING METHOD FOR THE COMPONENTS MADE OF MULTI HETEROGENEOUS MATERIALS
After design, computer models for representing components made of multi heterogeneous materials first need to be built so that further analysis, optimization and manufacturing can be implemented based on the models. Current modeling techniques can capture only the geometric information [15, 16]. Some researchers [23-28] are focusing on modeling heterogeneous objects by including the variation in constituent composition along with the geometry in the solid model for functionally graded materials, but representing the microstructure of heterogeneous components is beyond their scope [23]. Since the microstructure size is very small, a model consisting of such microstructures requires a huge amount of data to be stored; even with the help of high-speed modern computers, processing such a model is extremely difficult and needs extreme care and thought for I/O operations. This chapter develops a modeling method that can be implemented by applying the functions of current CAD graphic software and builds a model that includes all the material information (about periodic microstructures, constituent compositions, inclusions, and embedded parts) along with the geometry information in current 3D solid modeling, without compromising the speed of operations or reasonable utilization of computer resources. A special supporting component will be taken as a practical example to describe the modeling method.

3.1. Analyses of the requirements for representing the components made of heterogeneous materials
The requirements for representing a component made of heterogeneous materials should be made clear before developing a modeling method for multi heterogeneous components. Since heterogeneous materials cover composite materials, functionally graded materials, and heterogeneous materials with a periodic microstructure, the requirements for each of them are analyzed, respectively, as follows. As introduced in Section 1, a composite material consists of one or more discontinuous phases distributed in one continuous phase, as shown in Figure 1. The properties of composite materials result mainly from the material properties of both their matrix and inclusions and from the geometric features and distribution of their inclusions. Thus, to describe a component made of a composite material, its CAD model has to specify the geometric feature, material, and distribution of the inclusions and the matrix material, as well as the geometric model of the material region in the component. The geometric feature is represented by a code name that can be used to retrieve the necessary information from a database for confecting the spraying material; the necessary information includes the type of inclusions, such as fibers, sheets, or lumps, and the normal distribution parameters of their dominant dimensions. Functionally graded materials are used to join two different materials without stress concentration at their interface. There are many material composition functions [5], and designers can choose suitable composition functions from them for their applications. For example, the following parabolic function is selected for the material
composition function of the metal/ceramic functionally graded material in the cylinders of vehicular engines or pressure vessels:
(32)
where V_m is the volume fraction of metal and x is the distance from one side. The coefficients of the parabolic function are optimized subject to the criteria that the thermal flux across the material is minimized and that the thermal stresses are minimized and restricted below the yield stress of the material. In fact, many of nature's organisms have functionally graded tissues, such as teeth, skins, bones, and bamboo; their composition functions have been optimized by evolution based on nature's survival-of-the-fittest mechanism. After the determination of the composition function, physical properties can also be estimated based on property estimation models [5]. Thus, in order to describe a component made of a functionally graded material, its CAD model has to specify its material constituents and their composition functions, as well as the geometric model of the material region in the component. A heterogeneous material with a periodic microstructure is described by its base cell, which is the smallest repetitive unit of the material and comprises a material phase and a void phase, as shown in Figure 2. To describe a component made of a heterogeneous material with a periodic microstructure, its CAD model has to specify the variable geometric model, material constituents and distribution function of the base cells, as well as the geometric model of the material region in the component. According to the previous analysis of the requirements for representing the components made of the three types of materials, it is obvious that the functional requirement (FR) of the CAD model can be decomposed into three sub-FRs: representing the geometries of the material regions in the component (needed for all these materials), their material constituent compositions, and their material microstructures (including the geometric feature and distributions of inclusions for composite materials and the base cell for those with a periodic microstructure). These are noun phrases corresponding to "what we want to achieve" and can be written as:
FR_1 = Representing geometries of the material regions in the component
FR_2 = Representing material constituent compositions of the material regions in the component
FR_3 = Representing material microstructures of the material regions in the component

Thus, according to Axiomatic Design [11, 12], the CAD model should be decomposed into three sub-models to satisfy the three sub-FRs, respectively, as the design solutions (DS). These are stated starting with a verb, corresponding to "how we achieve it", and can be written as:
DS_1 = Build their geometric models
DS_2 = Build their material constituent composition models
DS_3 = Build their material microstructure models
Therefore, the design equation for it can be obtained as follows:
$\begin{Bmatrix} FR_1 \\ FR_2 \\ FR_3 \end{Bmatrix} = \begin{bmatrix} X & 0 & 0 \\ X & X & 0 \\ X & X & X \end{bmatrix} \begin{Bmatrix} DS_1 \\ DS_2 \\ DS_3 \end{Bmatrix}$  (33)
From the equation obtained, it can be seen that the design matrix is a triangular matrix, which indicates that the design solution is a decoupled design and satisfies the Independence Axiom [11, 12]. In other words, it is correct to decompose a CAD model of the multi heterogeneous component into the three types of sub-models without coupling, since satisfying the Independence Axiom ensures the independence of these sub-models.
3.2. Unified CAD modeling for the component made of heterogeneous materials
According to the analysis in the previous section, CAD models for the components made of the three types of materials can be uniformly formed, or integrated, from the three types of sub-models. The first type of sub-model is a geometric model. A 3D solid model representing the geometry of a component can be made using current CAD graphic software and is indicated by C. It can be divided into n portions or regions based on their material constituent compositions. Thus, the material constituent composition set can be indicated by:
$C = \{C_i, \ i = 1, 2, \ldots, n\}$  (34)
According to its material microstructures, the geometric model can also be divided into m parts or regions if there are m different microstructures. The material microstructure set can be written as:

$S = \{S_j, \ j = 1, 2, \ldots, m\}$  (35)
Thus, the material region set (M) of the component can be obtained by solving the Cartesian product of C and S as:

$M = C \times S = \{M_{ij} \mid i \in (1, 2, \ldots, n), \ j \in (1, 2, \ldots, m)\}$  (36)
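As a small illustration (ours) of Eq. (36), each region of C and S can be represented by the set of geometric cells it covers, so that M retains only the combinations that actually occur in the component:

    def material_regions(C, S):
        # Cartesian product of the two region sets (Eq. 36), keeping only
        # the non-empty geometric overlaps. C maps a composition-region
        # code (e.g. 1..n) to a set of cells; S maps a microstructure-region
        # code (e.g. "a".."d" or 1..m) to a set of cells.
        M = {}
        for i, cells_c in C.items():
            for j, cells_s in S.items():
                overlap = cells_c & cells_s
                if overlap:
                    M[f"{i}{j}"] = overlap   # e.g. region "6a"
        return M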
For example, there are six material constituent composition regions (n = 6) and four material microstructure regions (m = 4) in the component C shown in Figure 12. Solving the Cartesian product of C and S yields fourteen material regions.
Figure 12. Material regions for the components made of heterogeneous materials.
Each of these regions has a specified material constituent composition and a specified microstructure. The first Arabic numeral in the symbol of each region is the code name of its material constituent composition region, and the second (English) letter is the code name of its material microstructure region. Region 6a, for instance, indicates that the material constituent composition in this region is determined by that in Region 6 of the material constituent composition set, and that its material microstructure is specified by that in Region a of the material microstructure set. The last two sub-models are the material constituent composition model and the material microstructure model, which cannot be represented by a 3D solid model and have to take other forms. In fact, a model is an approximation of the component or object along one or more dimensions of interest, and can be any entity that exhibits some aspect of the component that is required for the purposes concerned [29]. Therefore, a model can take many different forms, such as a physical model, a wire-wrapped circuit board, a system of equations, frames and slots (i.e., a schema [30]), 3D solid models, or their combinations. We use a schema to represent the structural knowledge or information for each of the last two sub-models, since a schema makes it easy to establish the linkage among the graphics library, the database, and the application software, which is a prerequisite for modeling components with several sub-models. Each schema consists of several frames. Each frame represents a type of inclusion or periodic microstructure cell and consists of several slots. Each slot contains a type of information describing the frame in more detail, such as the type of the local coordinate system of a material region, the location and orientation of the local coordinate system in the global coordinate system, the type of spraying material, the inserting array for each type of periodic microstructure cell, the composition function of each material constituent,
or the code name of the variable geometric model of a periodic microstructure cell.

3.3. Material constituent composition models
Each region in the material constituent composition set has a specified material constituent composition. The volume fraction of the h-th material constituent at the position (x, y, z) in a Cartesian coordinate system, for example, can be represented as:
$V_h = f_h(x, y, z)$  (37)
This material composition function, along with primary material combinations and intended applications, can be obtained from the literature [5] and organized into a database for applications. Designers may select suitable material composition functions from it according to the functional requirements of a component. Based on schema theory [30], a frame and slots can be used to organize the knowledge for modeling. The model for the i-th material constituent composition region is then designed as the following typical schema with one frame:

C_i = {Coordinate system type: Cartesian, cylindrical, or spherical coordinate system
Origin of coordinate system: X_Ci, Y_Ci, Z_Ci
Orientation of coordinate system: α_Ci, β_Ci, γ_Ci
Number of material types: N_Ci
Material types: A_1, A_2, ..., A_NCi
Material constituent composition function:
$V_{h,Ci} = \left[ f_{h,Ci}(x, y, z), \ h = 1, 2, \ldots, N_{Ci} \ \middle| \ \sum_{h=1}^{N_{Ci}} V_{h,Ci} = 1, \ (x, y, z) \in C_i \right]$ }  (38)
The first slot is the local coordinate system type of the i-th material region, which may be a Cartesian, cylindrical, or spherical coordinate system. The second and third slots are the origin and the orientation of the local coordinate system, respectively, given in the global coordinate system. The fourth slot is the number of material types in the material region. The fifth slot lists the material types used in the material region, represented by their code names; all the information about each type of material can be retrieved from a material database according to its code name. The sixth slot is the material constituent composition function set, which includes the composition function of each material constituent in the region as a function of the position (x, y, z) in, for example, a Cartesian coordinate system. At each position, the sum of the volume fractions should be equal to one (100%).
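As a data-structure sketch of the schema in Formula (38) (our illustration, not the authors' implementation), the one-frame schema maps naturally onto a small Python class whose composition functions are checked to sum to one:

    from dataclasses import dataclass, field
    from typing import Callable, List, Tuple

    @dataclass
    class CompositionRegion:
        # Frame for the i-th material constituent composition region (Formula 38).
        coord_type: str                            # "Cartesian", "cylindrical", "spherical"
        origin: Tuple[float, float, float]         # (X_Ci, Y_Ci, Z_Ci), global coordinates
        orientation: Tuple[float, float, float]    # (alpha_Ci, beta_Ci, gamma_Ci)
        materials: List[str] = field(default_factory=list)       # code names A_1 ... A_NCi
        fractions: List[Callable] = field(default_factory=list)  # f_h,Ci(x, y, z)

        def volume_fractions(self, x, y, z):
            # Evaluate all composition functions and check they sum to one.
            v = [f(x, y, z) for f in self.fractions]
            assert abs(sum(v) - 1.0) < 1e-9, "volume fractions must sum to one"
            return v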
3.4. Material microstructure models
The material microstructure model (S) covers those for composite materials (R), heterogeneous materials with a periodic microstructure (P), and materials without inclusions and periodic microstructures (O), i.e.,

$S = \{S_j, \ j = 1, 2, \ldots, m \mid m = u + v + w, \ S_j \in (R + P + O)\}$  (39)
where R = {R_a, a = 1, 2, ..., u}, P = {P_b, b = 1, 2, ..., v}, and O = {O_d, d = 1, 2, ..., w}. Since there is no spraying or insertion operation in a region without inclusions and periodic microstructures, there is no need to build a material microstructure model for such a region, i.e., O = {O_d = "nil", d = 1, 2, ..., w}.

3.4.1. Material microstructure models for composite materials (R)
As mentioned previously, a composite material consists of a matrix and inclusions. The latter may have various shapes, sizes, and distributions; their shapes and sizes vary randomly, and their distribution densities in the matrix may or may not be variable. Since the components are considered to be made by layered manufacturing technology in this chapter, the inclusions are sprayed onto the layer where the matrix material is being spread. Using schema theory, the model for the a-th material microstructure region can be designed as follows:
R_a = {Coordinate system type: Cartesian, cylindrical, or spherical coordinate system
Origin of coordinate system: X_Ra, Y_Ra, Z_Ra
Orientation of coordinate system: α_Ra, β_Ra, γ_Ra
Number of spraying operations: N_Ra
Spraying operation 1:
Spraying operation 2:
...
Spraying operation N_Ra: }  (40)
The first three slots are the same as those in Formula (38). The fourth slot is different from that in Formula (38): it is the number of spraying operations (N_Ra), below which N_Ra sub-frames are listed. Each sub-frame describes one type of spraying operation and consists of two slots giving the details of the operation. The sub-frame for the first spraying operation, for example, can be written as:
Spraying operation 1:
Spraying material: Code name of material 1
Spraying function: V_Ra1 = [f_Ra1(x, y, z) | (x, y, z) ∈ R_a]
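A corresponding sketch (again ours) for the composite-material frame of Formula (40), holding each spraying operation as a pair of a material code name and a spraying volume-fraction function:

    from dataclasses import dataclass, field
    from typing import Callable, List, Tuple

    @dataclass
    class CompositeRegion:
        # Frame for the a-th material microstructure region (Formula 40).
        coord_type: str
        origin: Tuple[float, float, float]
        orientation: Tuple[float, float, float]
        # One (inclusion material code name, spraying volume-fraction function)
        # pair per spraying operation; their count plays the role of N_Ra.
        spraying_ops: List[Tuple[str, Callable]] = field(default_factory=list)

        def inclusion_fraction(self, x, y, z):
            # Total inclusion fraction at a position; the matrix material takes
            # up the remainder (enforced by the main model of Section 3.5).
            return sum(f(x, y, z) for _, f in self.spraying_ops)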
Figure 13. Periodic microstructures in cylindrical coordinate system.
The first slot in the sub-frame is the code name of the material type. All the information about the inclusion (e.g., its material type, shape, and average size) can be retrieved from a material database according to its code name. The second slot in the sub-frame is the spraying volume fraction of the inclusion, a function of the spraying position (x, y, z) in, for example, a Cartesian coordinate system. At each position, the sum of the volume fractions of all the inclusions and the matrix should be equal to one, which will be ensured in the main model introduced in Section 3.5.

3.4.2. Material microstructure models for heterogeneous materials with a periodic microstructure (P)
As introduced previously, a heterogeneous material with a periodic microstructure is described by its base cell, which is the smallest repetitive unit of the material and comprises a material phase and a void phase. The base cells are arranged in a rectangular array (Figure 3(b)), a cylindrical array, or a spherical array. Figure 13(b), for example, shows a cross section of the base cells shown in Figure 13(a) in a cylindrical array, and also represents the cross section passing through the center of a spherical array. The model for the b-th material microstructure region with a heterogeneous material with a periodic microstructure can be expressed by a schema like Formula (41):

P_b = {Coordinate system type: Cartesian, cylindrical, or spherical coordinate system
Origin of coordinate system: X_Pb, Y_Pb, Z_Pb
Orientation of coordinate system: α_Pb, β_Pb, γ_Pb
Number of insertion operations: N_Pb
Insertion operation 1:
Insertion operation 2:
...
Insertion operation N_Pb: }  (41)
The first three slots are the same as those in Formula (40). The fourth slot is different: it is the number of insertion operations. If the base cell consists of only one type of material, the number of insertion operations is 1 and there is only one sub-frame below the fourth slot, which includes six slots. Taking the base cells (in the Cartesian coordinate system) shown in Figure 3(b) as an example, the sub-frame can be expressed as follows:
Insertion operation 1:
Insertion: Code name of base cell
Insertion material: Nil
Inserting position function: (x, y, z) = [(X_1(t_1), Y_1(t_2), Z_1(t_3)) | (t_1, t_2, t_3) ∈ "Integer", (x, y, z) ∈ P_b]
Dimension: F_D1(x, y, z)
Orientation: F_O1(x, y, z)
Type of RBO: Matrix dominant complex-union
The first slot in the sub-frame is the pattern of the base cell, which can be retrieved from a variable microstructure graphics library according to the code name of its pattern. The second slot is the material of the base cell. When the heterogeneous material with a periodic microstructure consists of only one type of material, its material is the matrix material, which has already been determined by the material constituent composition model; this slot can therefore be filled with "Nil". The third slot is the inserting positions of the base cells, which should be at the points of an array in the local coordinate system and within the material microstructure region. For example, if a Cartesian coordinate system is applied, the array can be determined, as shown in Figure 14, by:
$x = \{b_x t_1 + c_x, \ t_1 = 1, 2, \ldots\} = \{x_1, x_2, \ldots\}$
$y = \{b_y t_2 + c_y, \ t_2 = 1, 2, \ldots\} = \{y_1, y_2, \ldots\}$
$z = \{b_z t_3 + c_z, \ t_3 = 1, 2, \ldots\} = \{z_1, z_2, \ldots\}$  (42)

where b_x, b_y, b_z, c_x, c_y and c_z are constants. The fourth slot is the dimension of the base cell, which is determined by a special function set, F_D1(x, y, z), a vector comprising all the parameters of the 3D parametric model of the base cell, where x, y, z are the coordinates of the inserting points of the base cells. In a Cartesian coordinate system, the "Dimension" entries for all the base cells shown in Figure 3(b) are the same, i.e., Dimension: "constant". But if a cylindrical or spherical coordinate system is used, as shown in Figure 13, the dimension vectors of all the base cells on the same circle are the same, and those on different circles are linear functions of their radial position coordinates. The fifth slot is the orientation of the base cell, which is also determined by a special function set, F_O1(x, y, z), a vector comprising the three axial angles of the base cell, where x, y, z are the coordinates of the inserting points of the base cells.
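A short sketch of Eq. (42) (ours; the region-membership predicate is a placeholder for the geometric test against the region P_b) that enumerates the inserting points:

    import itertools

    def inserting_positions(b, c, t_max, inside_region):
        # Enumerate base-cell inserting points per Eq. (42):
        #   x = b_x*t1 + c_x, y = b_y*t2 + c_y, z = b_z*t3 + c_z.
        # b and c are (b_x, b_y, b_z) and (c_x, c_y, c_z); t_max bounds the
        # integer indices; inside_region is a predicate standing in for
        # membership in the material microstructure region.
        for t1, t2, t3 in itertools.product(range(1, t_max + 1), repeat=3):
            p = (b[0] * t1 + c[0], b[1] * t2 + c[1], b[2] * t3 + c[2])
            if inside_region(p):
                yield p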
Figure 14. Inserting positions.
In a Cartesian coordinate system, the orientation vectors for all the base cells shown in Figure 3(b) are the same, i.e., Orientation: "constant". But if a cylindrical or spherical coordinate system is used, as shown in Figure 13, the orientation vectors of the base cells are the normal vectors at their inserting points; thus, the orientation vectors of all the base cells along the same radial direction are the same. The sixth slot is the type of Reasoning Boolean Operation (RBO) [31, 32]. The RBO differs from the conventional Boolean Operation [15, 16]: the latter deals only with geometry, whereas the former deals with both geometry and material information. Unlike conventional Boolean Operations, an RBO is executed according to the dominant material information, and is defined as either a matrix dominant or an inclusion dominant union, subtraction, or intersection, according to the design intent. Here, three types of RBO are used, illustrated as follows:
• Matrix dominant subtraction
In Figure 15, let the constituent compositions of matrix material A, inclusion material B and inclusion material C be M_A, M_B and M_C, respectively. Matrix dominant subtraction excavates matrix material at the inserting position according to the shape of the inclusion, to obtain a gaseous inclusion or void. This operation can be expressed as: (43)
and the result is shown in Figure 15(a).
Figure 15. Reasoning Boolean Operations.
• Inclusion dominant complex-union
This operation excavates matrix material at the inserting position according to the shape of the inclusion to obtain a void first, and then inserts the inclusion into it. When inclusion B is applied, the operation can be expressed as: (44)
and the result is shown in Figure 15(b). If inclusion C is applied, the result is shown in Figure 15(c).
• Matrix dominant complex-union
This operation excavates matrix material at the inserting position according to the shape of the inclusion to obtain a gaseous inclusion first, then inserts the inclusion into it, and replaces the inclusion material with the matrix material. If inclusion C is applied, the operation can be expressed as: (45)
and the result is shown in Figure 15(d).
When the heterogeneous material with a periodic microstructure (e.g., that shown in Figure 16 [8]) consists of several types of materials, such as the three types of materials shown in Figure 17, its base cell can be decomposed into three sub-cells. Its model is the same as Formula (41); the number of insertion operations should be 3, below which there are three sub-frames. Each sub-frame represents a sub-cell insertion and also has six slots. The material of the sub-cell with the largest volume, or of the functionally graded material in a cell, is defined as the matrix material. For the base cell in Figure 17, sub-cell 1 has the largest volume, so its material is taken as the matrix material and is determined by its material constituent composition model; thus, its insertion material is still "Nil" and its type of RBO is still matrix dominant complex-union.
Figure 16. An example of the microstructures with a single material.
Figure 17. An example of the microstructures with three material constituents.
But in each of the next two sub-frames, for the other two sub-cells, the code name of its material should be specified as the insertion material in the second slot, and "inclusion dominant complex-union" should be filled in as the type of RBO in the sixth slot.

3.5. Main model for integrating the two types of sub-models
After the sub-models have been made in the form of schema for material constituent composition and material microstructure in each region, a main model can be built to
integrate these sub-models for application. The main model can be written as:

Q_G = {Material constituent composition model: C = {C_i, i = 1, 2, ..., n}
Material microstructure model: S = {S_j, j = 1, 2, ..., m | m = u + v + w, S_j ∈ (R + P + O)}
Composite material model: R = {R_a, a = 1, 2, ..., u}
Periodic microstructure model: P = {P_b, b = 1, 2, ..., v}
Model for those without microstructures: O = {O_d = "nil", d = 1, 2, ..., w}
Material region: M = C × S = {M_ij | i ∈ (1, 2, ..., n), j ∈ (1, 2, ..., m)}
Number of material regions: N_M
Material region 1:
Material region 2:
...
Material region N_M: }  (46)
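Continuing the data-structure sketches above, the main model of Formula (46) can be held as a plain mapping that ties each material region to the IDs of its two sub-models; this is our illustration, not the authors' storage format:

    def build_main_model(composition_models, microstructure_models, regions):
        # Assemble the main model Q_G of Formula (46). "regions" is a list of
        # dicts, each holding "composition_id", "microstructure_id" and, for
        # composite regions only, "matrix_fraction" (so that inclusion and
        # matrix volume fractions sum to one at every position).
        return {
            "composition_models": composition_models,
            "microstructure_models": microstructure_models,
            "material_regions": regions,
            "n_regions": len(regions),   # the N_M slot
        }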
The first five slots in Q_G are used to describe the sub-models of the component. The sixth slot is the material region set. The seventh slot is the number of material regions, below which there are N_M sub-frames giving the details of the N_M material regions. In the sub-frame of each material region, the first two slots are the identification codes (IDs) of the material constituent composition model and the material microstructure model, respectively. If the material microstructure is a composite material, a third slot must be added to specify the volume fraction of its matrix material, since the sum of the volume fractions of all the inclusions and the matrix at each position should be equal to one, as mentioned previously. If material region 1 is a composite material region, for instance, its sub-frame can be written as:
Material region 1:
ID of material constituent composition model: [c1, c1 ∈ C]
ID of material microstructure model: [s1, s1 ∈ S]
Volume fraction of matrix: $V_{h,c1} = v_{h,c1}\left(1 - \sum_{N=1}^{N_{R1}} V_{R1N}\right), \ h = 1, 2, \ldots, N_{c1}$
where c1 is not necessarily C_1 but is one particular region in C, s1 is likewise not necessarily S_1 but is one particular region in S, and so on.

3.6. An example of modeling
Figure 18 shows a special support component. Its right end is required to provide a high abrasion-resistance capacity, and its lengthwise thermal deformation should be close to zero. In order to meet these requirements, a material (m_1) with good strength is employed for the left end part; another material (m_2) with a special microstructure (M_2) and a very small thermal expansion coefficient is used for the intermediate body; and a composite material (material m_3 as matrix material and m_4 as inclusions) with a high abrasion-resistance capacity is applied for its
Figure 18. An example of CAD modeling for a component.
right end part. To prevent high stresses and cracking at the interface between two kinds of materials, functionally graded materials are applied both between m_1 and m_2 and between m_2 and m_3. Therefore, this component can be divided into five material constituent composition regions and four material microstructure regions. Its CAD model can be built by integrating the following sub-models:
(1) 3D solid models of the component
According to Eqs. (34) and (35), its material constituent composition set and material microstructure set are indicated as follows:

C = {C_1, C_2, C_3, C_4, C_5}  (47)
R = {R_1}  (48)
P = {P_1}  (49)
O = {O_1, O_2} = {nil, nil}  (50)
S = {S_j, j = 1, 2, 3, 4 | S_j ∈ (R + P + O)} = {nil, P_1, nil, R_1}  (51)

The material regions of the component can be obtained by solving the Cartesian product of C and S as:

Q = M = C × S = {M_11, M_21, M_22, M_32, M_42, M_43, M_54}
  = {(C_1, nil), (C_2, nil), (C_2, P_1), (C_3, P_1), (C_4, P_1), (C_4, nil), (C_5, R_1)}  (52)
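Using the region-set sketch given after Eq. (36), the example's material region set can be reproduced once each region is assigned the set of geometric cells it occupies; the cell sets below are hypothetical placeholders, chosen only so that exactly the seven combinations of Eq. (52) survive:

    # Hypothetical cell sets; any consistent geometry would do.
    C = {1: {"a"}, 2: {"b", "c"}, 3: {"d"}, 4: {"e", "f"}, 5: {"g"}}
    S = {1: {"a", "b"}, 2: {"c", "d", "e"}, 3: {"f"}, 4: {"g"}}

    M = material_regions(C, S)
    # -> keys "11", "21", "22", "32", "42", "43", "54", matching Eq. (52)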
Thus, the B-rep scheme [15, 16] can be used to represent the shape of the whole component and all the borders between different material regions in the component.
(2) Its material constituent composition model C
Since there are five material constituent composition regions, five models can be built according to the schema shown in Formula (38); all the attributes for the slots of each model are listed in Table 3.
(3) Its composite material model R
It has only one composite material region, and its model can be written as follows:
R_1 = {Coordinate system type: cylindrical coordinate system
Origin of coordinate system: 0, 0, z_6
Orientation of coordinate system: 0, 0, 0
Number of spraying operations: 1
Spraying operation 1:
Spraying material: m_4
Spraying function: V_R11 = [f_R11(z) | 0 ≤ z ≤ (z_7 − z_6)] }  (53)
(4) Its periodic microstructure model P
It has only one periodic microstructure region, and its model can be obtained as:
P_1 = {Coordinate system type: cylindrical coordinate system
Origin of coordinate system: 0, 0, z_2
Orientation of coordinate system: 0, 0, 0
Number of insertion operations: 1
Insertion operation 1:
Insertion: Code name of base cell
Insertion material: Nil
Inserting position function: (r, θ, z) = [(k_r1 t_1 + c_r1, k_θ1 t_2 + c_θ1, k_z1 t_3 + c_z1) | (t_1, t_2, t_3) ∈ "Integer", r_4 ≤ r ≤ r_3, 0 ≤ θ ≤ 2π, 0 ≤ z ≤ (z_5 − z_2)]
Dimension: k_D1 r + c_D1
Orientation: θ
Type of RBO: Matrix dominant complex-union }  (54)
Table 3. Attributes for each slot in material constituent composition models

Region No. of material constituent composition | Type of coordinate system | Origin of coordinate system | Orientation of coordinate system | Number of materials | Material types
1 | Cylindrical | 0, 0, 0 | 0, 0, 0 | 1 | m_1
2 | Cylindrical | 0, 0, z_1 | 0, 0, 0 | 2 | m_1, m_2
3 | Cylindrical | 0, 0, z_2 | 0, 0, 0 | 1 | m_2
4 | Cylindrical | 0, 0, z_4 | 0, 0, 0 | 2 | m_2, m_3
5 | Cylindrical | 0, 0, z_6 | 0, 0, 0 | 1 | m_3

Note: Z is the coordinate of the global coordinate system of the component and z is that of the local coordinate system of a material region.
4. FINITE ELEMENT ANALYSIS BASED ON THE CAD MODELS

The CAD models of components made of multi heterogeneous materials are intended to be used not only for depositing the information from the design procedure described in Section 2 of this chapter but also for subsequent analysis, optimization, and layered manufacturing. This section illustrates their use in finite element analysis. A component made of multi heterogeneous materials normally requires more nodes and finite elements to model and to describe completely the response of the whole structure, since it possesses a non-homogeneous character at a microscopic scale. The order of its final stiffness matrix may therefore be very large, and the stiffness matrix and equations for solution may exceed the memory capacity of the computer. A procedure to overcome this problem is to separate the whole structure into smaller units called substructures [33, 34], which are analyzed separately to obtain the relationship between forces and displacements at, for instance, the common interfaces or boundaries. These boundary variables are then determined and used to obtain the unknowns within each substructure. In this case, each material region of the material set can be considered a substructure. Therefore, the finite element analysis can be implemented according to the following procedure:
(1) Create and discretize the component into finite elements based on the material regions of its material set
The material constituent composition and the material microstructure have been clarified in each region. The number of finite elements to be constructed depends on the precision of the analysis and the degree of non-homogeneity of the materials: the higher the precision of the analysis and/or the degree of non-homogeneity, the more finite elements are required.
(2) Build the stiffness matrix for the boundary of each material region
In each finite element constructed, there are only very small changes in material constituent composition if a functionally graded material is used in it, and the distribution of inclusions is even when a composite or a heterogeneous material with a periodic microstructure is used in it. The stiffness matrix for each finite element can be obtained using the theory of homogenization [6-10], which has been developed since the 1970s and can be used as an alternative approach to find the effective properties of the equivalent homogenized material. From a mathematical point of view, the theory of homogenization is a limit theory which uses asymptotic expansion and the assumption of periodicity to substitute the differential equations with rapidly oscillating coefficients by differential equations whose coefficients are constant or slowly varying, in such a way that the solutions are close to those of the initial equations [35]. After the finite elements in a material region are assembled to represent the entire region, an equilibrium equation can be obtained as follows:

$[K^{(r)}]\{\delta^{(r)}\} = \{F^{(r)}\}$  (55)
where $\{F^{(r)}\}$ is the load vector, $\{\delta^{(r)}\}$ is the displacement vector, and $[K^{(r)}]$ is the global stiffness matrix of the r-th material region in its material set. For the r-th material region, there are two types of nodes: those on its boundary and those inside it; the stiffness matrix, the displacement vector and the load vector can be partitioned corresponding to the boundary and internal degrees of freedom, $\{\delta_b^{(r)}\}$ and $\{\delta_i^{(r)}\}$, respectively. Its equilibrium equation can thus be rewritten as:

$\begin{bmatrix} K_{bb}^{(r)} & K_{bi}^{(r)} \\ K_{ib}^{(r)} & K_{ii}^{(r)} \end{bmatrix} \begin{Bmatrix} \delta_b^{(r)} \\ \delta_i^{(r)} \end{Bmatrix} = \begin{Bmatrix} F_b^{(r)} \\ F_i^{(r)} \end{Bmatrix}$  (56)

where $F_b^{(r)}$ and $\delta_b^{(r)}$ are the load and displacement vectors of the nodes on its boundary, respectively, and $F_i^{(r)}$ and $\delta_i^{(r)}$ are the load and displacement vectors of the nodes inside the region, respectively. Then, the stiffness matrix of its boundary can be obtained as follows [33]:

$[K_b^{(r)}] = [K_{bb}^{(r)}] - [K_{bi}^{(r)}][K_{ii}^{(r)}]^{-1}[K_{ib}^{(r)}]$  (57)
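A compact NumPy sketch (ours) of the condensation in Eqs. (56) and (57), together with the per-region load condensation used in step (4) below; np.linalg.solve replaces the explicit inverse for numerical robustness:

    import numpy as np

    def condense_region(K, F, b_idx, i_idx):
        # Static condensation of one material region (substructure).
        # K, F: region stiffness matrix and load vector; b_idx, i_idx:
        # index arrays of boundary and internal degrees of freedom.
        Kbb = K[np.ix_(b_idx, b_idx)]
        Kbi = K[np.ix_(b_idx, i_idx)]
        Kib = K[np.ix_(i_idx, b_idx)]
        Kii = K[np.ix_(i_idx, i_idx)]
        Kb = Kbb - Kbi @ np.linalg.solve(Kii, Kib)      # Eq. (57)
        Rb = Kbi @ np.linalg.solve(Kii, F[i_idx])       # internal loads to boundary
        Fb = F[b_idx] - Rb                              # per-region part of Eq. (60)
        return Kb, Fb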
(3) Build the global stiffness matrix $[K_b]$ for the boundaries of all the regions in the component
The above analysis is carried out for all the material regions in the material set, and the stiffness matrix for each region is obtained. Then, treating each region as an element, the global structure stiffness matrix can be formed by the usual assembly procedure of the direct stiffness method as:

$[K_b] = \sum_{r=1}^{L} [K_b^{(r)}]$  (58)
where L is the number of material regions in its material set.
(4) Generate the load vector for the nodes on the boundaries of all the regions in the component
First, the loads on the nodes inside each region are converted to loads on the nodes on the boundary of each region using:

$\{R_b^{(r)}\} = [K_{bi}^{(r)}][K_{ii}^{(r)}]^{-1}\{F_i^{(r)}\}$  (59)
and then the load vector for the nodes on the boundaries of all the regions in the component can be obtained using:
$\{S_b\} = \{F_b\} - \sum_{r=1}^{L} \{R_b^{(r)}\}$  (60)
Here, the physical interpretation of $\{R_b^{(r)}\}$ is the force required to be applied at the region boundaries to keep the boundary displacements equal to zero, i.e., to fix the boundaries.
(5) Calculate the displacement vector for the boundaries of all the regions in the component
After the stiffness matrix and load vector for the boundaries of all the regions in the component are obtained in steps 3 and 4, respectively, the displacement vector can be calculated from the equilibrium equation as:

$\{\delta_b\} = [K_b]^{-1}\{S_b\}$  (61)
so that the displacement for the boundaries of each region, $\{\delta_b^{(r)}\}$, can be obtained.
(6) Determine the displacement vector for the nodes inside each material region
According to Eq. (19), the displacement vector can be determined by [22]:

$\{\delta_i^{(r)}\} = [K_{ii}^{(r)}]^{-1}\left(\{F_i^{(r)}\} - [K_{ib}^{(r)}]\{\delta_b^{(r)}\}\right)$  (62)

(7) Analyze the stresses in the component
Up to now, all the displacements for the nodes both inside and on the boundary of each region have been obtained. Thus, the stresses can be calculated following the usual finite element procedure [33, 34].

5. SUMMARY
With the rapid development of high technology in various fields, more critical requirements for special functions of components/products appear, which cannot be satisfied by using conventional homogeneous materials. Attention has therefore focused on heterogeneous materials, including composite materials, functionally graded materials, and heterogeneous materials with a periodic microstructure. The design method for conventional components made of a homogeneous material or a single heterogeneous material is always to choose a material first, then design the component's configuration and check whether the component satisfies the functional requirements. For components made of multiple heterogeneous materials, however, the design process has to be reversed, i.e., from the functional requirements of the high-tech application, to the component's configuration, to the material properties, and then to the microstructures and/or constituent compositions. The design procedure goes through: (1) design the component's configuration according to the first type of performance requirements (CA_1), using conventional CAD technology; (2) determine the material properties in different portions or regions of the component according to the second type of performance requirements, using sensitivity analysis and the steepest descent method; (3) select optimal material constituent compositions and microstructures for different portions of the component to satisfy the material property requirements and various constraints from material affinity, manufacturability, etc., supported by a related heterogeneous materials database, using Genetic Algorithms; and (4) optimize the parameters of the configuration based on the material selection, using finite element analysis. The first and the
fourth phases belong to geometric design, which is well developed. The second and third phases concentrate on material design, which has been introduced in more detail in this chapter. With this design method, all the information (about both configuration and material) needed for creating a CAD model of a component made of multi heterogeneous materials can be obtained. Since this method and the subsequent CAD modeling must both be implemented on computers using the functions of current CAD/CAE software, the method is also a computer-aided design method. After geometric and material design, a CAD model for representing the component made of multi heterogeneous materials needs to be built so that further analysis, optimization and manufacturing can be implemented based on the model. The CAD modeling method for components made of multi heterogeneous materials divides a component into many material regions (M_ij), based on two region sets (C and S), each region having a specified material constituent composition and a specified microstructure. For each region, the CAD model consists of three sub-models: a geometric model, a material constituent composition model, and a material microstructure model. The first sub-model is a 3D solid model, and the last two sub-models are in the form of schemas. The CAD model for a component made of multi heterogeneous materials is formed by integrating the three sub-models for each material region. This method can be implemented by employing the functions of current CAD graphic software, and builds a model that includes all the material information (about periodic microstructures, constituent compositions, and inclusions) along with the geometry information in current 3D solid modeling, without the problems arising from too much data. Such a CAD modeling system has also been developed and applied. The CAD models of components made of multi heterogeneous materials are intended to be used not only for depositing the information from the design procedure described in Section 2 of this chapter but also for subsequent analysis, optimization, and layered manufacturing. A component made of multi heterogeneous materials normally requires more nodes and finite elements to model and to describe completely the response of the whole structure, since it possesses a non-homogeneous character at a microscopic scale. The order of its final stiffness matrix may be very large; its stiffness matrix and equations for solution may exceed the memory capacity of the computer, and solution efficiency will be very poor. A procedure to overcome this problem is to separate the whole structure into smaller units called substructures (in this case, each material region of the material set can be considered a substructure), which are analyzed separately to obtain the relationship between forces and displacements at, for instance, the common interfaces or boundaries. These boundary variables are then determined and used to obtain the unknowns within each substructure. Therefore, the finite element analysis can be simplified and implemented based on the CAD model.

ACKNOWLEDGEMENTS
The reported research is supported by a Competitive Earmarked Research Grant of the Hong Kong Research Grants Council (RGC) under project code HKU 7062/00E. The financial contribution is gratefully acknowledged. This chapter is further written
based on the authors' two journal papers in Computer-Aided Design, Vol. 35, 2003, pp. 453-466, "Computer-aided design method for the components made of heterogeneous materials" and "CAD modeling for the components made of multi heterogeneous materials and smart materials", with permission from Elsevier.

REFERENCES

[1] Berthelot, J. M. Composite materials: mechanical behavior and structural analysis. New York: Springer-Verlag, 1999.
[2] Chawla, K. K. Composite materials: science and engineering. New York: Springer-Verlag New York, Inc., 1998.
[3] Barbero, E. J. Introduction to composite materials design. Ann Arbor, MI: Taylor & Francis, 1998.
[4] Miyamoto, Y. et al. Functionally Graded Materials: Design, Processing and Applications. Boston: Kluwer Academic Publishers, 1999.
[5] Bhashyam, S., Shin, K. H., and Dutta, D. An integrated CAD system for design of heterogeneous objects. Rapid Prototyping Journal, 2000; 6: 119-135.
[6] Larsen, U. D., Sigmund, O., and Bouwstra, S. Design and fabrication of compliant micromechanisms and structures with negative Poisson's ratio. Journal of Microelectromechanical Systems, 1997; 6: 99-106.
[7] Silva, E. C. N., Fonseca, J. S. O., and Kikuchi, N. Optimal design of piezoelectric microstructure. Computational Mechanics, 1997; 19: 397-410.
[8] Sigmund, O. and Torquato, S. Design of materials with extreme thermal expansion using a three-phase topology optimization method. J. Mech. Phys. Solids, 1997; 45(6): 1037-1067.
[9] Bendsoe, M. P. Optimization of structure topology, shape, and material. Berlin: Springer-Verlag, 1995.
[10] Hassani, B. and Hinton, E. Homogenization and structural topology optimization: theory, practice and software. New York: Springer-Verlag, 1999.
[11] Suh, N. P. The Principles of Design. New York: Oxford University Press, Inc., 1990.
[12] Suh, N. P. Axiomatic Design: Advances and Applications. New York: Oxford University Press, Inc., 2001.
[13] Rao, S. S. Engineering Optimization: Theory and Practice. New York: John Wiley & Sons, Inc., 1996.
[14] Chen, K. Z. Identifying the relationships among design methods: key to successful application and development of design methods. Journal of Engineering Design, 1999; 10: 125-141.
[15] Lee, K. Principles of CAD/CAM/CAE Systems. Reading: Addison-Wesley Longman, Inc., 1999.
[16] McMahon, C. and Browne, J. CADCAM: Principles, Practice and Manufacturing Management. Reading: Addison-Wesley Longman, Inc., 1998.
[17] Prinja, N. K. Use of Finite Element Analysis in the Design Process. Glasgow: NAFEMS, 2000.
[18] Bakshi, P. and Pandey, P. C. Semi-analytical sensitivity using hybrid finite elements. Computers and Structures, 2000; 77: 201-213.
[19] Gen, M. and Cheng, R. Genetic Algorithms & Engineering Design. New York: John Wiley & Sons, Inc., 1997.
[20] Chen, K. Z., Zhang, X. W., Ou, Z. Y., and Feng, X. A. Recognition of digital curves scanned from paper drawings using Genetic Algorithms. Pattern Recognition, 2003; 36(1): 123-130.
[21] Milne-Thomson, L. M. The Calculus of Finite Differences. New York: Chelsea Pub. Co., 1981.
[22] Blaha, M. R. A Manager's Guide to Database Technology: Building and Purchasing Better Applications. Prentice Hall, 2001.
[23] Kumar, V. and Dutta, D. An approach to modeling & representation of heterogeneous objects. Journal of Mechanical Design, 1998; 120: 659-667.
[24] Kumar, V., Burns, D., Dutta, D., and Hoffmann, C. A framework for object modeling. Computer-Aided Design, 1999; 31: 541-556.
[25] Jackson, T. R., Liu, H., Patrikalakis, N. M., Sachs, E. M., and Cima, M. J. Modeling and designing functionally graded material components for fabrication with local composition control. Materials and Design, 1999; 20(2/3): 63-75.
[26] Siu, Y. K. and Tan, S. T. Source-based heterogeneous solid modeling. Computer-Aided Design, 2002; 34(1): 41-55.
[27] Siu, Y. K. and Tan, S. T. Modeling the material grading and structures of heterogeneous objects for layered manufacturing. Computer-Aided Design, 2002; 34(10): 705-716.
[28] Morvan, S. and Fadel, G. M. MMA-Rep: A V-Representation for Multi-material Objects. Software Solutions for Rapid Prototyping. PEP Press, UK, 2002.
[29] Ulrich, K. T. and Eppinger, S. D. Product design and development. Boston: McGraw-Hill Company, Inc., 2002.
[30] Jonassen, D. H., Beissner, K., and Yacci, M. Structural knowledge: techniques for representing, conveying, and acquiring structural knowledge. Hillsdale, New Jersey: Lawrence Erlbaum Associates, Inc., 1993.
[31] Sun, W., Lin, F., and Hu, X. Computer-aided design and modeling of composite unit cells. Composite Science and Technology, 2001; 61: 289-299.
[32] Sun, W. and Hu, X. Reasoning Boolean operation based modeling for heterogeneous objects. Computer-Aided Design, 2002; 34: 481-488.
[33] Krishnamoorthy, C. S. Finite element analysis: theory and programming. New Delhi: Tata McGraw-Hill Publishing Company Limited, 1994.
[34] Logan, D. L. A first course in the finite element method using Algor. Boston: PWS Publishing Company, 1997.
[35] Oleinik, O. A. On homogenization problems. Trends and Applications of Pure Mathematics. Berlin: Springer, 1984.
QUALITY AND COST OF DATA WAREHOUSE VIEWS¹

ANDREAS KOELLER², ELKE A. RUNDENSTEINER, AMY LEE³, AND ANISOARA NICA⁴

¹This work was in part supported by several NSF grants, namely, the NSF NYI grant #IRI 9796264, NSF CISE Instrumentation Grant #IRIS 9729878, and the NSF grant #IIS 9988776.
²This work was performed while Andreas Koeller was a Research Assistant at Worcester Polytechnic Institute.
³This work was performed while Amy Lee was a Research Assistant at Worcester Polytechnic Institute and a Ph.D. student at the University of Michigan, Ann Arbor.
⁴This work was performed while Anisoara Nica was a Ph.D. student at the University of Michigan, Ann Arbor.

1. INTRODUCTION

Query rewriting has been used as a query optimization technique for several decades to reduce the computational cost of a query. Traditional problems in query rewriting include in particular query optimization [28, 60, 6] and rewriting queries using views [40, 7]. Most of these works deal with the problem of maintaining the exact original interface (schema) and extent of a given query while optimizing performance. They are thus based on the restricting assumption that the rewritten query must be equivalent to the initially given query. Recently, query rewriting with relaxed semantics has been proposed as a means of retaining the validity of a data warehouse (i.e., materialized queries) in situations where equivalent rewritings may not exist, yet alternate but not necessarily equivalent query rewritings may still be preferable to users over not receiving any answers at all [34, 43, 53]. Other scenarios that also motivate a relaxation of the "exact query" assumption include loosely-specified query paradigms [44], relaxed restrictions on WHERE-clauses to generate approximate result sets [8], and vaguely specified queries in semistructured
environments that need to be refined during query evaluation, as well as market-oriented environments in which very similar (but not equal) results in answering a query can lead to dramatically different query computation costs. Some more recent work in XML also addresses the approximate query answering problem, for example the approxQL project [55] or the XXL project [59]. Generating non-equivalent query results raises a new problem in the context of query rewriting. Since the results returned for a given query may now be quite distinct, this leads to the problem of having to compare "incomparable" query results, or rewritings, for a given query. As one would expect, the number of non-equivalent query rewritings is in general much larger than the number of equivalent query rewritings. Given that the search space is now even larger than for the equivalent query rewriting problem, an automated means of comparing the various rewritings is needed. In this chapter, we report the development of such a measurement model for non-equivalent rewritings. While, as illustrated above, the problem arises in many different environments in which queries are used, for the purpose of this work we focus our attention on E-SQL [53] as the relaxed query model and on the issue of view maintenance in data warehouses as the motivation for establishing the model. In this work, we introduce the two dimensions of information preservation (quality) and view maintenance performance (cost) of query rewritings as two key components of the proposed model. The chapter addresses the need for measuring the divergence between queries in a quantifiable manner by proposing measures for the interface and extent divergence of the query results, referred to as the quality of the rewriting. Given that the independent dimensions of quality and cost cannot easily be evaluated and compared against each other, we analyze the semantics of these dimensions and propose a model assigning numerical values and trade-off parameters in order to achieve a quantifiable overall evaluation for query results. The resulting model, which we call the Quality-Cost Model (QC-Model), combines these two dimensions into a single measure. We address several core issues of the problem, including the definition of a distance between view extents (called here the "degree of divergence") and several properties of the cost model, which we adapted from the literature on incremental view maintenance cost [6, 65]. We describe the comprehensive test bed we have developed for the purpose of experimentation and demonstration (built as part of the EVE-System demonstrated at ACM SIGMOD 1999 [52]), which also incorporates the QC-Model as presented in this chapter. We report upon an experimental study we have conducted. Our experiments assess the trade-offs among the different factors of the quality and cost measures, characterizing correlations and independence among them. We also study the effect of different parameters of the view rewritings on the QC-Model, such as the number of ISs over which the view is defined, the distribution of relations over a fixed number of ISs, and so on. Using our experimental setup, we have evaluated the accuracy of our proposed view overlap estimation for the quality portion of the QC-Model. The experiments indeed show a strong correlation between estimated and actual view extent overlaps. Similarly, we have also conducted a number of experiments designed to assess the predictability of the proposed QC-Model in terms of its cost
measure in estimating the actual view maintenance cost. The experiments show a strong correlation between the predicted and actual incremental view maintenance cost, and thus support the utility of our proposed QC-Model. In summary, this work makes the following contributions. First, it identifies the problem of trading off quality against cost for non-equivalent query rewritings and the need for a model for assessing these measures. Second, we introduce the measure of quality for a query, and establish techniques for determining the quality measure for a given query based on empirically supported findings. Third, we establish an integrated measure for both quality and cost, based on an existing cost model for distributed view maintenance [65]; the resulting Quality-and-Cost (QC) model that we propose assigns numerical values to approximate query rewritings. Fourth, we have developed a fully distributed data warehouse maintenance system for demonstrative and experimental purposes [52, 15]. Our prototype not only includes view synchronization algorithms [43, 29, 45, 63] and algorithms for incremental view maintenance [4, 62, 22], but also utilizes the QC-Model as a criterion for selecting a good view rewriting among the ones generated by the view synchronizer. Fifth, we perform an analytical evaluation of the properties of our model, characterizing trends, correlations and independence among the different QC-Model factors. Sixth, we use our software for an experimental evaluation demonstrating the utility and soundness of the QC-Model, using statistical methods and measuring the correlation between predicted and measured QC-values. While we have developed the QC-Model in the context of data warehousing [53], it is also applicable to other areas of query reformulation, as mentioned above. The remainder of this chapter is organized as follows: Section 2 introduces background concepts necessary for the development of the QC-Model, whereas Sections 3 and 4 present a detailed analytic model of quality and cost trade-offs, respectively. Section 5 describes our prototype implementation. Section 6.2 summarizes experimental results and Section 7 reviews related work, while Section 8 discusses our conclusions.

2. NON-EQUIVALENT QUERY REWRITINGS
The notion of relaxed queries has appeared in the past in several contexts, such as the EVE system [53, 11, 9] and, more recently, XML [55, 58]. The notion of a relaxed view definition is a generalization of the problem of traditional query rewriting, in which the execution plans or queries generated may be syntactically different, but will always be (semantically) equivalent to the original query, i.e., compute the same output relation. On the other hand, relaxed queries may compute a different extent and even a different view interface (schema) than the original query. Relaxed queries are useful in the context of approximate rewritings of views in the presence of partly redundant information sources. A typical case would be a view definition using information from an information source R which becomes unavailable at some point in time. The view may then be rewritten to replace the missing information with information from
another information source R', as long as R and R' are known to contain the same, or similar, data. Clearly, such a system does not have to, nor should it be restricted to, produce equivalent query rewritings. Rather, in order to achieve meaningful yet relaxed query rewritings, we propose that it is useful to specify user preferences as to which elements of a query (attributes, selection conditions, relations) may be replaced and/or removed from the query without sacrificing the usefulness of the view to its users. Two factors guide the rewriting process: the degree of redundancy in the information space and the degree of relaxation allowed, as expressed by user preferences about flexibility in the query definition. Depending on those factors, the rewriting process may yield a large and possibly exponentially (over the size of the information space) growing number of legal rewritings for an affected query. Under the assumption of non-equivalent query rewriting, each new query could be specified on disparate base relations with different cardinalities at different sites, and hence return a different view interface, a different view extent, or even both. This leads to the necessity of comparing such non-equivalent queries in order to find a rewriting that best matches the view user's needs. The goal of this paper is thus to develop a "desirability" model for query rewritings. Towards this end, we will introduce the two concepts of quality and cost of a query rewriting as two key measures for establishing such a comparison. The first measure is the degree of divergence in quality (i.e., information preservation) between two queries (cf. Section 3). The second measure represents the long-term maintenance cost associated with a view, which for example occurs in a data warehousing context, where the cost to maintain a view significantly influences the usefulness of the view to the user (cf. Section 4). Other costs could of course be incorporated into the latter measure, depending on the purpose of the overall measurement model.

3. EFFICIENCY MODEL: QUALITY OF A QUERY REWRITING
3.1. Information preservation in rewritings
The information, i.e., quality, returned by a query is of great importance to its users. The information returned in the (relational) result of a query can be determined in terms of two aspects, namely the query interface (i.e., the set of attributes in the SELECT clause of the query definition) and the query extent (data). When a relation or attribute that is used by a view definition V becomes unavailable, the view V would be rewritten, making use of redundant information in the underlying information space and of user preferences regarding the "rewritability" of the view. Ideally we would like to replace V by a rewriting V_i such that V_i is "equivalent" to V in terms of both quality aspects, although some information may be taken from other information sources. When V_i is not equivalent to V, we say that V_i diverges from V. Ranking rewritings which preserve V to different degrees is not trivial. This can best be demonstrated by an example.
Figure 1. Different amounts of information are preserved in rewritings. (The panels show the original base table Customer and the rewritings, with panel (c) showing rewriting V2 over base table MABranch.)
Example 1. Let the view V over the database in Fig. 1 be defined as follows:

CREATE VIEW V AS
SELECT Name, Address, City, Phone
FROM   Customer
WHERE  CustomerSince < 1996                                  (1)
Assume the relation Customer is deleted from its site. Two possible rewritings, obtained by replacing Customer with BackBay and MABranch, respectively, are:

CREATE VIEW V1 AS
SELECT Name, Address
FROM   BackBay
WHERE  CustomerSince < 1996                                  (2)

CREATE VIEW V2 AS
SELECT Name, City, Phone
FROM   MABranch
WHERE  CustomerSince < 1996                                  (3)
From the viewpoint of the query interface, V1 and V2 each preserve a different subset of attributes of the original interface: Name and Address by V1, and Name, City, and Phone by V2. From the viewpoint of the query extent, considering the common set of attributes between the original query and a query rewriting, V1 preserves two out of three tuples of the original query without introducing any extra tuples (i.e., precise but not total recall), while V2 preserves the original query with two surplus tuples (i.e., total recall but not precise). Obviously, we need a mechanism to decide which rewriting is closer to the original and thus the best choice as a replacement for V, and thus superior to the others. Therefore, our system must trade off the pros and cons between query interface and query extent preservation (and also between the two dimensions of the query extent: precision
and recall) in order to rank these potential rewritings of V so that the "best" solution with regard to that ranking can be selected.

3.2. Information preservation on the view interface
In this section, we propose a method to measure the preservation of the interface (schema) of a view in its non-equivalent rewritings. The basic principle is to assign user preferences to attributes in the view schema. There are two fundamental dimensions in which such user preferences can be expressed: dispensability and replaceability.

3.2.1. Dispensable and replaceable attributes
For each attribute in the view schema, we assign two Boolean parameters: attribute replaceable (AR) and attribute dispensable (AD).

Definition 1 (Replacement) Consider a view V whose schema contains an attribute V.A originating from a base table R. Additionally, consider a relation R' that contains data related to R, in the sense that both relations contain information about the same real-world objects, and some of their respective attributes contain the same data about those objects. Let V_i be a non-equivalent rewriting of view V. Then, attribute V_i.A is a replacement for attribute V.A if (1) it originates from table R', which must be a superset of, a subset of, or equivalent to R, and (2) V_i.A stores the same data about the same objects as V.A.

The authors of this paper have explored ways to express the concept of the "same data" in attributes as well as the overlap of base table extents. [53, 33]

Definition 2 (Attribute replaceable (AR)) An attribute A in a view schema V is considered replaceable if the view user regards a view rewriting V' containing a replacement for A as useful.

Definition 3 (Attribute dispensable (AD)) An attribute A in a view schema V is considered dispensable if the view user regards a view rewriting V' that does not contain A as useful.

There are four possible combinations of these attribute parameters:

• AR=true/AD=true: attribute can be replaced or deleted in any rewriting. The semantics of this case are quite clear: a user would specify such semantics if the attribute is not very important for her.
• AR=true/AD=false: attribute can be replaced but not deleted. These parameters would apply for an attribute whose data might be supplied from a different source but which always needs to be supplied in some way.
• AR=false/AD=true: attribute can be deleted but not replaced. This case justifies a closer look. The concept of "replaceability" expresses a user's preferences with regard
to the trustworthiness of an attribute. Declaring an attribute replaceable means that the user trusts other data sources to provide reliable information about that attribute. On the other hand, by declaring an attribute non-replaceable, a user declares that s/he will not trust the data in the attribute if it is not supplied by the original data source. Therefore, a user might allow that an attribute A can be deleted (if it is dropped from its original data source) but not replaced from other data sources "offering" this information. The issue here is one of trust in the reliability of information provided by alternative sources.
• AR=false/AD=false: attribute cannot be deleted or replaced. These semantics would be specified for essential attributes in a view whose only trusted source is the original one.

With our explanations above, it becomes clear that the two dimensions (or preferences) of replaceability and dispensability are orthogonal. Note that the dimension of "replaceability" (trustworthiness) could be expanded to allow for different quality levels depending on the source of the data. For example, a user of a travel data view might trust information supplied by a major travel agency, but not data originating from consolidators or small unknown data suppliers. Here, however, we restrict ourselves to a Boolean replaceability measure for simplicity. The choice of essentially two classes of relaxation parameters (dispensability and replaceability) is an approach to trading off the complexity of the system (i.e., the expressiveness of the quality model) against the ease of use by a user (i.e., the simplicity of specifying such relaxed query semantics). Various extensions of this model, such as a replacement of the Boolean preference values by numeric "fuzzy" values (as done for WHERE-clauses only in CoBase [8]), are of course possible, but are beyond the scope of the current paper (nor would we expect them to significantly change the treatment in this current work). After defining the semantics of all four combinations of the two preferences, we can now measure the preservation of a view interface in numeric terms. In order to achieve this, we observe that since the categories AD and AR are orthogonal, a user will generally have separate preferences on whether it is better to replace an attribute or to delete it, in case both operations are allowed. Therefore, we simply assign numerical weights to an attribute for each of the four combinations of the AD and AR parameters. An attribute with a higher weight is then more "important" than an attribute with a lower weight and should have a higher chance of being preserved. However, we also observe that indispensable attributes (AD = false) must be preserved in any view rewriting, thus forcing the weights for those cases to be infinite. In summary, we have the situation depicted in the table in Fig. 2. The table expresses that a view rewriting is legal if it omits attributes in categories 1 or 2 (i.e., dispensable attributes), but that a user might have preferences between these two cases. As the ultimate goal of our preference model is the comparison of view rewritings, we will normalize all results and thus require 0 ≤ w_1, w_2 ≤ 1. Note that there are two choices for the relative values of w_1 and w_2:
Figure 2. Weights for the four classes of preserved attributes.

• w_1 ≥ w_2: This represents the fact that a user is in favor of preserving the replaceable attributes (i.e., attributes in category 1). A view having replaceable attributes may be evolved further as more schema changes occur, as our experimental evaluation in Section 6.2 confirms, whereas having relatively many non-replaceable attributes (i.e., attributes in category 2) has a negative effect on the further ability of a view query to evolve. In other words, it is harder to find good legal rewritings for a view if its view elements are non-replaceable.
• w_1 < w_2: This represents the case in which a user finds a non-replaceable attribute more worthy of being preserved in a view rewriting than a replaceable attribute. This would express a low confidence of a user in the reliability of alternative data sources, and would state that s/he prefers to lose access to some information over having unreliable information in the view.
3.3. Information preservation on view extent
We now introduce a notation for the common subset of attributes and some set operators using the common-subset-of-attributes semantics. For this notation, we will use bag semantics, i.e., duplicates which may occur after projection of a relation to a subset of attributes are not removed.
Definition 4 (Common Subset of Attributes of V with respect to V_i) Let V and V_i be two relations, such that Attr(V) ∩ Attr(V_i) ≠ ∅. We use V^π(V_i) to denote the projection of relation V on the common attributes of V and V_i. That is, V^π(V_i) = Π_{Attr(V) ∩ Attr(V_i)}(V). Similarly, V_i^π(V) is defined as Π_{Attr(V) ∩ Attr(V_i)}(V_i).

Besides considering the attributes preserved in the legal rewritings, the sets of tuples returned by the queries will also have an impact on the user's satisfaction with a view rewriting V_i. When the view interfaces of a legal rewriting V_i and the original view V are not the same, the extent preservation evaluation is done by comparing tuples on the common subset of attributes only. When the view interfaces of V_i and V are the same, the extent comparison is done as usual. We can also define set (bag) intersection and difference under the implicit assumption of "common-subset-of-attributes".
Figure 3. Set operators on the common subset of attributes of V and V_i:

V ∩^π V_i = { z | ∃ t ∈ V ∧ ∃ t_i ∈ V_i : z = t[Attr(V) ∩ Attr(V_i)] = t_i[Attr(V) ∩ Attr(V_i)] }
V −^π V_i = { z | ∃ t ∈ V : z = t[Attr(V) ∩ Attr(V_i)] ∧ ∄ t_i ∈ V_i : z = t_i[Attr(V) ∩ Attr(V_i)] }
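To make these operators concrete, the following is a minimal Python sketch of Definition 4 and the Figure 3 operators; relations are modeled as lists of attribute-to-value dicts, the bag intersection is read as a multiset minimum, and all function names are our own illustrative choices rather than part of the EVE implementation.

from collections import Counter

def common_attrs(v, v_i):
    """Attr(V) ∩ Attr(V_i), read off the first tuple of each (non-empty) relation."""
    return sorted(set(v[0]) & set(v_i[0]))

def project(rel, attrs):
    """Bag projection: duplicates created by the projection are kept."""
    return Counter(tuple(t[a] for a in attrs) for t in rel)

def bag_intersect_pi(v, v_i):
    """V ∩^π V_i: projected tuples occurring in both relations (bag minimum)."""
    attrs = common_attrs(v, v_i)
    return project(v, attrs) & project(v_i, attrs)

def bag_minus_pi(v, v_i):
    """V −^π V_i: projected tuples of V with no matching tuple in V_i."""
    attrs = common_attrs(v, v_i)
    p_v, p_vi = project(v, attrs), project(v_i, attrs)
    return Counter({z: n for z, n in p_v.items() if z not in p_vi})

v  = [{'Name': 'Ann', 'City': 'Boston'}, {'Name': 'Bob', 'City': 'Natick'}]
vi = [{'Name': 'Ann', 'City': 'Boston', 'Phone': '555'}]
print(bag_intersect_pi(v, vi))  # Counter({('Boston', 'Ann'): 1})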
3.4. Metric of quality: Degree of Divergence (DD)
When choosing from among a number of rewritings, we would like to choose a legal rewriting such that the view or query extent V does not change. If it is not possible to find a rewriting that satisfies this condition, we choose a rewriting that produces a view extent as close as possible to the original one. Some rewritings may have a larger number of tuples in V preserved, but at the same time generate extra tuples that were not in V. On the other hand, some legal rewritings may preserve fewer tuples in V, but also generate fewer surplus tuples. In this section, we discuss how to generate a good rewriting according to the user's preferences by making a choice which tries to generate a rewriting that preserves information to as large a degree as possible. Below, we discuss how we quantify the quality of query rewritings in terms of the query interface and the query extent, individually, and how to unify these two measures into one single measure: the Degree of Divergence (DD) of a rewriting V_i from the original query V.

3.4.1. Degree of divergence on the query interface (DD_attr(V_i))
Let |A_i^1| be the number of dispensable attributes (AD = true) in the query interface of V_i that are replaceable (AR = true, category 1 in Fig. 2). Likewise, |A_i^2| is the number of attributes in category 2. The query flexibility value of the query interface of V_i can then be defined as follows:

QF_{V_i} = w_1 · |A_i^1| + w_2 · |A_i^2|        (4)
with w_1, w_2 the weights on the two measures as introduced in Section 3.2.1. As those weights express a relative preference between the two attribute types, we require them to not both be 0 at the same time (i.e., w_1, w_2 ≥ 0 and w_1 + w_2 = 1). The query flexibility value of the original view V is defined likewise and denoted by QF_V. The normalized degree of divergence of V_i from V in terms of the query interface, denoted by DD_attr(V_i), can then be defined as:
DD_attr(V_i) = 0                            if QF_V = 0
DD_attr(V_i) = (QF_V − QF_{V_i}) / QF_V     otherwise
This is a measure of the distance of the interface of a rewriting from the original query interface. QF_V = 0 occurs if the attributes contained in the original query V are all indispensable. In this case, any legal rewriting V_i of V must preserve the entire
view interface. That is to say, V_i cannot diverge from V on the query interface, i.e., DD_attr(V_i) = 0. When there are dispensable attributes in V, and QF_V > 0, then DD_attr(V_i) is computed as defined above. If V_i does not preserve any of the dispensable or replaceable attributes, then QF_{V_i} = 0 and DD_attr(V_i) = 1. In terms of the query interface, V_i is preferred to V_j if DD_attr(V_i) < DD_attr(V_j).
Example 2. Let us look at the query and rewritings defined in Example 1. In that example, Attr(V) = {Name, Address, City, Phone}, A^1 = {Address, City, Phone}, and A^2 = ∅ (since Name is indispensable). Therefore, QF_V = 3 · w_1. The rewriting V1 preserves the attribute Address (besides attribute Name). Therefore, QF_{V1} = 1 · w_1. On the other hand, the rewriting V2 preserves two of the dispensable attributes, City and Phone. Therefore, QF_{V2} = 2 · w_1. Thus, V2 is preferred to V1, as indicated by DD_attr(V2) < DD_attr(V1).
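As a minimal sketch of how Equation (4) and DD_attr play out on Example 2, the following Python fragment recomputes the ranking; the concrete weight values w_1 = 0.7, w_2 = 0.3 are an illustrative choice of ours (any weights with w_1 + w_2 = 1 would do), and the function names are not taken from the EVE code.

def query_flexibility(n_cat1, n_cat2, w1, w2):
    """QF = w1*|A^1| + w2*|A^2| (Equation 4): category 1 counts attributes
    that are both dispensable and replaceable, category 2 those that are
    dispensable but non-replaceable."""
    return w1 * n_cat1 + w2 * n_cat2

def dd_attr(qf_v, qf_vi):
    """Normalized interface divergence of rewriting V_i from V."""
    return 0.0 if qf_v == 0 else (qf_v - qf_vi) / qf_v

w1, w2 = 0.7, 0.3                         # illustrative preference weights
qf_v  = query_flexibility(3, 0, w1, w2)   # V: Address, City, Phone in category 1
qf_v1 = query_flexibility(1, 0, w1, w2)   # V1 preserves only Address of those
qf_v2 = query_flexibility(2, 0, w1, w2)   # V2 preserves City and Phone
print(dd_attr(qf_v, qf_v1))               # ≈ 0.667: V1 diverges more
print(dd_attr(qf_v, qf_v2))               # ≈ 0.333: V2 is preferred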
3.4.2. Degree of divergence on the query extent (DD_ext(V_i))

The divergence of a legal rewriting V_i from V is computed in two dimensions:
D1. The (relative) number of tuples in the original V that are not preserved in the new V_i, denoted by DD_ext-D1(V_i):

DD_ext-D1(V_i) = 1 − |V ∩^π V_i| / |V^π(V_i)|        (5)
D2. The (relative) number of tuples in the new view V_i that are not in the original view V, denoted by DD_ext-D2(V_i):

DD_ext-D2(V_i) = ( |V_i^π(V)| − |V ∩^π V_i| ) / |V_i^π(V)| = 1 − |V ∩^π V_i| / |V_i^π(V)|        (6)
We express the number of tuples that are not preserved (Case D1) as a ratio to the size of the original view extent |V^π(V_i)|, whereas the number of extra tuples coming into the new view (Case D2) is seen in relation to the size of the new view extent |V_i^π(V)|. That means, we see the loss of tuples (imperfect recall in information-theoretic terms) as occurring in the old view, whereas the negative effect of additional "wrong" tuples (imperfect precision) is seen in relation to the new view. The total extent divergence of V_i from V is the weighted sum of DD_ext-D1(V_i) and DD_ext-D2(V_i), denoted by DD_ext(V_i), and defined as follows:

DD_ext(V_i) = ρ_D1 · DD_ext-D1(V_i) + ρ_D2 · DD_ext-D2(V_i)
            = 1 − (ρ_D1 · |V_i^π(V)| + ρ_D2 · |V^π(V_i)|) · |V ∩^π V_i| / ( |V^π(V_i)| · |V_i^π(V)| )        (7)
where ρ_D1 and ρ_D2 are the trade-off parameters between DD_ext-D1(V_i) and DD_ext-D2(V_i) (ρ_D1, ρ_D2 ≥ 0 and ρ_D1 + ρ_D2 = 1). Again, the view definer is given an opportunity to set the trade-off parameters, with the default setting being (ρ_D1, ρ_D2) = (0.5, 0.5). Those default settings reflect an assumption that recall and precision of a rewriting are of equal importance for end-users. Note that in this section, we do not discuss how to obtain accurate estimates for the input parameters of the formulae above. Estimating such parameters is application-dependent, and a variety of techniques are available to help with the task (notably sampling-based techniques to estimate sizes of arbitrary queries such as [23, 25, 16, 47]).
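To illustrate Equations (5) through (7), here is a small Python sketch that recomputes the extent divergence for the two rewritings of Example 1 from their three cardinalities; the tuple counts follow the example (V1 keeps two of three tuples with no surplus, V2 keeps all three but adds two), and the default trade-off (0.5, 0.5) is used.

def dd_ext(card_orig, card_new, card_common, rho_d1=0.5, rho_d2=0.5):
    """Extent divergence (Equations 5-7), computed from |V^π(V_i)|,
    |V_i^π(V)| and |V ∩^π V_i|. rho_d1/rho_d2 trade recall against
    precision and default to the equal-importance setting."""
    dd_d1 = 1 - card_common / card_orig   # tuples of V lost (Equation 5)
    dd_d2 = 1 - card_common / card_new    # surplus tuples in V_i (Equation 6)
    return rho_d1 * dd_d1 + rho_d2 * dd_d2

print(dd_ext(3, 2, 2))  # V1: 0.5*(1/3) + 0.5*0   ≈ 0.167
print(dd_ext(3, 5, 3))  # V2: 0.5*0   + 0.5*(2/5) = 0.2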
3.4.3. Total degree of divergence

With the findings of this section, we now define the total degree of divergence of V_i from V as:

DD(V_i) = ρ_attr · DD_attr(V_i) + ρ_ext · DD_ext(V_i),   where ρ_attr, ρ_ext ≥ 0 and ρ_attr + ρ_ext = 1.        (8)
ρ_attr and ρ_ext are parameters assigned by the view user. They represent user preferences for the view interface over the view extent.

4. EFFICIENCY MODEL: VIEW MAINTENANCE COST OF A LEGAL REWRITING
In this section, we now discuss the measure of view maintenance cost as a method for ranking view rewritings.

4.1. View maintenance basics
For most applications, data updates such as inserts or deletes of tuples to/from the base relations take place more frequently than schema changes in the information space. Therefore, we choose to rank the legal rewritings by their long-term view maintenance costs.³ A legal rewriting is considered to be preferred if its expected view maintenance costs are low compared to other legal rewritings. We further assume that a conventional incremental view maintenance algorithm similar to the one specified in [65] is used to bring the view extent up-to-date right after the information source data is updated. Adopting their approach, we introduce three major cost factors (for a single data content update) for a particular legal rewriting: the number of messages exchanged, the number of bytes transferred, and the I/O cost at the local ISs. Our cost model works well for such view maintenance environments and is therefore explained here. Other cost models are conceivable for other purposes, as long as they return a single numeric cost value for a given query.

³The cost for recomputing the original view extent after a view re-definition is a one-time cost. Thus we do not rank the legal rewritings on this one-time view update cost.
4.2. Cost factor based on number of messages exchanged (CF_M)
The number of messages exchanged between the information space and the view site for a single base data update, denoted as CF_M, is in the range [0, 2m] (with m denoting the number of information sources involved in the view). To be more specific:

CF_M = 0              if m = 1 and n_1 = 0
CF_M = 2              if m = 1 and n_1 > 0
CF_M = 2 · (m − 1)    if m > 1 and n_1 = 0
CF_M = 2 · m          otherwise
with n_1 the number of relations in the update-generating IS besides the relation where the update occurred. The best case CF_M = 0 occurs when there is only one relation referred to in the view V or when V is self-maintainable as discussed by Gupta and others. [20] Self-maintainability is out of the scope of this paper, so we do not discuss it any further. Note that when there is only one relation in IS_1 referred to in V (n_1 = 0), then no query needs to be sent to IS_1.
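The case analysis translates directly into code. The following Python sketch mirrors the cases above; the values for the two middle cases (one query/answer round trip to the single IS, and round trips to all ISs except the update-generating one) are our reading of the message pattern and the stated [0, 2m] range, not a definitive specification.

def cf_m(m, n1):
    """Messages for a single base update (Section 4.2). m = number of ISs
    involved in the view; n1 = relations in the update-generating IS other
    than the updated one. Result lies in [0, 2m]."""
    if m == 1 and n1 == 0:
        return 0            # single-relation view: nothing to fetch
    if m == 1 and n1 > 0:
        return 2            # one query/answer round trip to the single IS
    if m > 1 and n1 == 0:
        return 2 * (m - 1)  # query every IS except the update-generating one
    return 2 * m            # otherwise all m ISs are queried

print(cf_m(1, 0), cf_m(1, 2), cf_m(3, 0), cf_m(3, 2))  # 0 2 4 6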
4.3. Cost factor based on bytes of data transferred (CF_T)

Considering an information space consisting of n relations R_1, ..., R_n in m information sources IS_1, ..., IS_m, it is possible to estimate the number of bytes transferred in the entire system during the incremental maintenance of the view after an update. Such a computation will generally assume that one inserted/deleted tuple is sent from an information source IS_1 to the view site, which is the initial delta relation. Then this delta relation is sent down to the information source IS_1 to join with other relations in IS_1 referred to in the view query, and the resulting new delta relation is sent back to the view site. The same process iterates through all the information sources referred to in the view to build up the delta relation that contains the tuples affected by the data update. This is the conventional incremental view maintenance approach. [65] Depending on the distribution of values in the join attributes of each underlying relation, estimates of the number of bytes transferred can be computed by statistical methods, possibly involving sampling [23, 25, 16, 47] or using traditional database statistics such as join selectivities, relation sizes, and duplicate counts. [13] View maintenance algorithms that deal with concurrent updates [4] or use parallel algorithms [64] will require a more careful estimation of the amount of data transfer.
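A rough statistics-based estimate of this iterative delta propagation can be sketched as follows; the uniform tuple width and the per-IS (join selectivity, cardinality) inputs are simplifying assumptions of ours, standing in for the database statistics or sampling mentioned above.

def estimate_cf_t(delta_tuples, tuple_bytes, hops):
    """Estimate bytes moved while propagating a delta through the ISs.
    hops = [(join_selectivity, relation_cardinality), ...], one entry per
    IS visited after the initial delta reaches the view site."""
    total = delta_tuples * tuple_bytes      # initial delta to the view site
    size = delta_tuples
    for selectivity, cardinality in hops:
        total += size * tuple_bytes         # delta sent down to the IS
        size = round(size * cardinality * selectivity)  # estimated join result
        total += size * tuple_bytes         # new delta back to the view site
    return total

# One inserted tuple of 60 bytes, joined through two further ISs:
print(estimate_cf_t(1, 60, [(0.002, 5000), (0.001, 20000)]))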
4.4. Cost factor based on I/O (CF_I/O)

We use the total number of estimated input/output operations (block accesses) performed by local ISs in order to process incremental view maintenance for each legal rewriting as a criterion to rank the legal rewritings. Let CF_I/O(IS_i) be the number of estimated I/Os at the information source IS_i. CF_I/O(IS_i) is the sum of the I/Os of the relations that reside at source IS_i, i.e., incorporating the I/O-costs of all relations
at IS_i. Then the total number of I/Os, denoted as CF_I/O, is the sum of the I/Os at all m sources, i.e.,

CF_I/O = Σ_{i=1}^{m} CF_I/O(IS_i)        (9)
Algorithms to estimate the number of blocks accessed in order to retrieve a tuple from a database are given in the literature, dating back to Yao. [61]

4.5. Total view maintenance cost for a single data update
The total view maintenance cost of a view V with respect to a single data update can now be defined as:

Cost(V) = CF_M · cost_M + CF_T · cost_T + CF_I/O · cost_I/O        (10)
where cost_M, cost_T, and cost_I/O are the unit prices for sending a message, transferring a data block, and performing a disk I/O, respectively. We can now compute the total view maintenance costs, COST(V_i), for the updates within a certain time unit. In order to normalize the cost for our model, we find the highest and lowest costs, respectively, from all view rewritings generated, and normalize the cost for each rewriting over the range given by the maximum and minimum. If we assume that there are k legal rewritings for an affected view, the total cost of legal rewriting V_i can be normalized as follows:
COST*(V_i) = ( COST(V_i) − min_{1≤j≤k} COST(V_j) ) / ( max_{1≤j≤k} COST(V_j) − min_{1≤j≤k} COST(V_j) )        (11)
This gives us a view maintenance cost between 0 and 1 that we can trade off against the view quality (Section 3). The rewriting with cost 0 is the best (lowest maintenance cost), and the rewriting with cost 1 is the worst in our model.

4.6. Overall efficiency of a legal rewriting
The overall efficiency of a legal rewriting can now be computed as:

QC(V_i) = 1 − (ρ_quality · DD(V_i) + ρ_cost · COST*(V_i))        (12)
with ρ_quality, ρ_cost ≥ 0 and ρ_quality + ρ_cost = 1. With both quality and cost normalized, this number will be between 0 and 1. If ρ_quality > 0 ∧ ρ_cost > 0, an efficiency of 0 means a legal rewriting that preserves the least amount of information among all rewritings at the highest cost. Likewise, an efficiency of 1 would identify a "perfect" legal rewriting preserving the complete view interface and all tuples at the lowest possible cost.
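Putting Equations (11) and (12) together, the following Python sketch ranks a set of candidate rewritings from their precomputed divergences and raw maintenance costs; the sample numbers and the equal quality/cost weighting are illustrative only.

def normalized_costs(costs):
    """COST* (Equation 11): min-max normalization over the k rewritings."""
    lo, hi = min(costs), max(costs)
    return [0.0 if hi == lo else (c - lo) / (hi - lo) for c in costs]

def qc_values(dds, costs, rho_quality=0.5, rho_cost=0.5):
    """QC (Equation 12) per candidate rewriting; higher is better."""
    return [1 - (rho_quality * dd + rho_cost * c)
            for dd, c in zip(dds, normalized_costs(costs))]

# Three candidates with total divergences DD and raw per-period costs:
scores = qc_values(dds=[0.1, 0.3, 0.6], costs=[120.0, 80.0, 200.0])
best = max(range(len(scores)), key=scores.__getitem__)
print(scores, best)  # the second rewriting wins here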
Figure 4. The framework of the evolvable view environment (EVE).
5. REVIEW OF THE EVE PROJECT
Our quality-and-cost model can be used in a variety of environments in which multiple, non-equivalent queries are generated. However, it was developed in the context of the Evolvable View Environment (EVE) [53, 54], which we will now briefly review as an example of an application of our proposed model. This example will show how the QC-Model can be integrated with a query rewriting system and how the features of the model complement the system under consideration. As mentioned earlier, views over distributed information sources are affected by capability changes of such sources. Our EVE-System provides a solution for the problem of views becoming undefined after such meta data changes (Figure 4). Major concepts of this architecture [53] are the registration of information sources and the storage of meta knowledge in a Meta Knowledge Base (MKB), which allows for a certain degree of cooperation between sources and middleware in EVE, the storage of view definitions in a View Knowledge Base (Section 5.1), and the application of View Synchronization Algorithms (Section 2). We give a very brief overview of the functions of those modules:
• Meta Knowledge Base (MKB) Meta information about participating ISs is stored in the MKB. The MKB consists primarily of information about semantic interrelationships observed between different ISs registered in the system. The data in the MKB is either entered manually, or the MKB can be filled partly or fully with the results of a meta-data discovery process (e.g., [30]).
• View Knowledge Base The view knowledge base stores information about views defined over the ISs by different users. These views are augmented with a user preference model about view evolution (cf. Section 5.1 on Evolvable SQL, below).
• QC-Computation This module computes the QC-Value, described in this paper, for newly rewritten views. This value is then used by the View Synchronizer to select the view rewriting to be used.
• View Maintainer This module is responsible for traditional incremental view maintenance after data updates in the sources. In EVE, we have implemented SWEEP [4] for this purpose.
• Concurrency Control (SDCC) This module handles the complex concurrency issues that occur in an environment that has to deal with both data and schema updates. [63]
• View Synchronizer When underlying ISs change their schema (not just their data), existing view queries have to be adapted in order to keep providing information to their users. This goal is accomplished in EVE by synchronizing views with the schema changes of underlying ISs. [43, 45]
• MKB Evolver/Consistency Checker These modules update the Meta Knowledge Base according to the schema changes occurring in the underlying information sources.
• Wrappers connect information sources with the data warehouse by translating information-source-specific query mechanisms and data models into the relational query model assumed for the system.
5.1. A relaxed SQL query model - E-SQL

We now introduce the E-SQL query language (or Evolvable-SQL), which is our approach towards relaxed query semantics; it implements and extends the semantics of attribute replaceability and dispensability introduced earlier (Sec. 3.2). E-SQL is an extension of SQL that has been designed to allow for the specification of relaxed query semantics by users. We take the stand that it is most appropriate for the view definers themselves to specify the relaxed semantics at the time of query specification, as they are the ones who know the criticality and dispensability of the different components of their query. The main idea of E-SQL is to allow a user to specify as part of a query definition what information is indispensable, what information is replaceable by similar information
Table 1. Relaxation parameters (preferences) of the E-SQL query language

Relaxation parameter            Domain                                       Default
Attribute-dispensable (AD)      true/false (dispensable/indispensable)       false
Attribute-replaceable (AR)      true/false (replaceable/non-replaceable)     false
Relation-dispensable (RD)       true/false (dispensable/indispensable)       false
Relation-replaceable (RR)       true/false (replaceable/non-replaceable)     false
Condition-dispensable (CD)      true/false (dispensable/indispensable)       false
Condition-replaceable (CR)      true/false (replaceable/non-replaceable)     false
View-extent (VE)                ≈: no restriction on the new extent;         =
                                =: new extent is equal to old extent;
                                ⊇: new extent is superset of old extent;
                                ⊆: new extent is subset of old extent
from other ISs, and what relationship between original and new query result is desired, if obtaining the original query result becomes impossible in the changed information space. Relaxation parameters (preferences) are associated with the different components of a query, such as the attributes in the SELECT clause, the conditions in the WHERE clause, and so on. Table 1 lists the seven types of relaxation parameters used in E-SQL. It has three columns: column one gives the parameter name and the abbreviation for each parameter, column two the possible values each parameter can take on plus the associated semantics, and column three the default value. When the parameter setting is omitted from an E-SQL query, then the default value is assumed (column 3 of Table 1). This means that a conventional SQL query (without explicitly specified preferences) has well-defined semantics in our model, i.e., anything the user specified in the original query must be preserved exactly as originally defined in order for the query to be well-defined. Our extended query semantics are thus well-grounded and compatible with regular (non-relaxed) SQL semantics. We now use an E-SQL example query (Equation 13) to demonstrate the usage of the relaxation parameters, while for a full description the reader is referred to [53].

CREATE VIEW Asia-Customer (VE = "⊇") AS
SELECT  C.Name, C.Address, C.Phone (AD = true, AR = true)
FROM    Customer C, FlightRes F (RR = true)
WHERE   (C.Name = F.PName) AND (F.Dest = 'Asia') (CD = true)        (13)
The semantics of this query are as follows. Any query rewriting is acceptable as long as the new view extent is a superset of the old one (expressed by VE = "⊇"); the attribute Phone is dispensable and can also be replaced from another source (expressed by AD = true, AR = true); the relation FlightRes (but not Customer) can be replaced
with another relation (RR = true); and the user will still have use for the view even if the second WHERE condition cannot be kept valid (CD = true). Furthermore, there are some dependencies between different settings for the relaxation parameters. For example, it is meaningless for a relation to be marked dispensable if one of its attributes is indispensable. Therefore the parameter settings (AD = false, RD = false) and (AD = false, RD = true) for a particular attribute are equivalent. We have developed a theory of strongest E-SQL queries [42], which describes equivalence classes of relaxation parameter settings based on their semantics.
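The dependency just described can be captured mechanically. The following Python sketch normalizes one such setting to its equivalent "strongest" form; the data layout and function name are our own illustration, not the formalism of [42].

def normalize_relation_preferences(attr_prefs, rd):
    """If any attribute of a relation is indispensable (AD = false), marking
    the relation dispensable (RD = true) has no effect, so both settings are
    normalized to RD = false. attr_prefs maps attribute -> {'AD': ..., 'AR': ...}."""
    if any(not p['AD'] for p in attr_prefs.values()):
        rd = False  # (AD=false, RD=true) is equivalent to (AD=false, RD=false)
    return rd

prefs = {'Name':  {'AD': False, 'AR': False},
         'Phone': {'AD': True,  'AR': True}}
print(normalize_relation_preferences(prefs, rd=True))  # False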
6. IMPLEMENTATION AND EVALUATION

6.1. Implementation of the EVE System
In the context of the EVE project [45, 51, 43, 53, 34], we have implemented an experimental data warehouse maintenance system that is able to maintain a data warehouse over distributed sources, handling both schema and data changes of ISs. The system is capable of breaking down queries and reassembling results from distributed ISs, incremental view maintenance using a simple multi-source view maintenance algorithm, performing data warehouse evolution according to a view synchronization algorithm [45], and computing QC-Values for the rewriting solutions for a given view and schema change as defined in the current paper, thus supporting the user in selecting a rewriting for a view. The QC-Value is computed as described in Sections 3 and 4, with the cost part computed according to the view synchronization algorithm used. Details on this method are given in the experiment in Section 6.2.3. The entire system is written in Java and uses a Swing (JFC) user interface. Connections to the databases are realized in JDBC with appropriate drivers, which gives us the flexibility to incorporate any relational DBMS available on the network. We have tested and run the system on different combinations of Oracle Server 8.0 and MS Access. The system has been tested on both Windows NT/2000 and Linux. Figure 5 shows an example screenshot of the running EVE-System. An IS provider has just deleted a table, and the system has generated four different view rewritings defined over the new information space that could replace the old view. Each view has a QC-Value assigned to it (see the left side of Figure 5), and the user can browse the composition of that QC-Value from the factors introduced earlier (see the right side of Figure 5), and then decide which view rewriting should be used. Based on our model, the rewriting with the highest QC-Value is the one 'closest' to the intent and extent of the original view. The results of this paper have been incorporated into our EVE system, which had previously simply picked the first legal view rewriting it discovered, and not necessarily the best one. As illustrated in Figure 5, the current system presents a number of choices for a view rewriting to the user, sorted by their numerical QC-Value. The user can then select the exact rewriting for each view, based on the QC-Value and its composition from quality and cost factors (see Figure 5). The implementation of the EVE system is
Figure 5. Different rewritings for a view and their respective QC-values.
fully functional, and has been demonstrated at the IBM technology showcase during the CASCON '98 conference [34] as well as at SIGMOD '99. [52]

6.2. Evaluation and discussion
We now set out to verify the validity of our proposed QC-Model and gain an understanding of the interplay between quality and maintenance costs through a number of experiments. Using the prototype implementation described earlier, we conducted experiments to evaluate the QC-Model. The experimental system holding the data warehouse was a Pentium 233 PC with 64 MB RAM running Windows NT 4.0 and
Java (JDK 1.1.6). As DBMSs, we used instances of Oracle 8.0 on separate Windows NT PCs as servers for each IS. We use tables without indexes for a predictable assessment of I/O-operations. Where large amounts of data were needed for an experiment, we used synthetic data generated by the TPC/D benchmark data generator. These experiments were conducted in the context of our EVE system, i.e., the rewritings of a view were being generated by our synchronization algorithm. [34, 43] In Section 6.2.1, we discuss the influence of certain parameters of the query and information space on the overall QC-Value. In Section 6.2.2, we assess how the cardinalities of base relations may influence both the quality and the cost of a view rewriting. In the experiment in Section 6.2.3, we show actual performance measures using the Java-based implementation of the system described in Section 6.1, to calibrate the trade-off coefficients to associate with the different components of the QC-Model. These empirically determined coefficients are then verified to result in accurate predictions of the QC-Model when compared with the actually measured maintenance costs, for our testing environment. This also has resulted in a methodology that can be used to find cost factors that help to predict actual query execution time if the QC-Model is used in a different system.

6.2.1. Influence of relation distribution on view maintenance cost
In this section, we study the relationships between the number and distribution of ISs involved in a view and the incremental view maintenance cost. We first assess the effects of a variation of the number of ISs involved in a view, while fixing all other parameter settings, such as the selectivity and the join selectivity. The purpose is to find a heuristic for a view synchronization algorithm to choose between otherwise similar views (that is, in particular, views with the same number of base relations) if the main difference between the views is the number of information sources on which they are based. We look at the three cost factors introduced in Section 4. We observe that:

1. the number of messages exchanged between data warehouse and base relations (CF_M) grows proportionally with the number of ISs,
2. the number of bytes transferred between the warehouse and sources (CF_T) grows with the number of information sources. This is due to the fact that a view based on fewer information sources can accomplish part of its joins inside information sources (Fig. 6).
3. the number of I/O-operations (CF_I/O), which refers to the total number of such operations across all base relations, remains roughly the same, since we assume access to the same base relations, which are simply stored in distinct information sources with similar system parameters.

That is, the view maintenance cost of a single data update tends to be higher for views with many information sources, allowing us to use this fact for a heuristic to guide a view rewriting algorithm. Secondly, we study the effect of the relation
Figure 6. Effect of base relation distribution on view maintenance cost (view V = R ⋈ S ⋈ T ⋈ U under two different view rewritings).
distribution among information sources. That is, we want to assess whether, in terms of view maintenance cost, it is beneficial for a view to have its base relations distributed evenly across information sources, or whether it is better to have most of its relations in one information source and only a few relations in others. Of course, this situation is of interest only for views based on many relations (at least 6 to 7). Figure 7 shows results for the number of bytes transferred (CF_T) for a number of representative cases. The other two factors (CF_M and CF_I/O) are not affected. The charts show the number of transferred bytes for a particular view maintenance operation, where we varied the distribution of relations among information sources. The view has 6 relations, which are distributed among 2, 3, or 4 information sources. For example, the leftmost bar in the figure (marked (1, 5)) shows the number of bytes transferred for a particular update for our view, where the view is defined on a single relation in one information source, plus 5 relations in one other information source. We repeated the experiment for different average join selectivities (js). From Fig. 7, we observe that there is no correlation between the relation distribution and the view maintenance cost. That is, a heuristic that would choose a particular relation distribution from among otherwise similar view rewritings would not be helpful in our environment.

6.2.2. Effect of relation cardinality on QC-Value
In this experiment, we study the relationship between the cardinalities of the substituted relations and the overall efficiency of the legal rewritings. We conduct this experiment by varying the cardinalities of the substituted relation while keeping all other parameter
settings the same. Let us assume a view V is defined as follows (this is a view defined over the TPC-D schema):
CREATE VIEW V (VE = '≈') AS
SELECT  ..., Lineitem.Orderkey (AR = true), ...
FROM    Order, Customer, Lineitem (RR = true)
WHERE   Lineitem.Orderkey = Order.Orderkey (CR = true) AND ...        (14)
Let us assume that relation Lineitem is deleted by its information provider, and that there are five relations Lineitem_1, ..., Lineitem_5 in the information space that are identified by the view synchronizer to be appropriate substitutes for Lineitem. Five new views, V_1, ..., V_5, can be defined that are formed by replacing relation Lineitem with the respective relation Lineitem_i. The cardinalities of Lineitem and the substitute relations for our experiment are summarized in Table 2. We further assume that the following inter-relationships among these relations hold true: Relation Lineitem_1 is contained in relation Lineitem_2, denoted by a CC constraint: CC_{Lineitem_1,Lineitem_2} = (Lineitem_1 ⊆ Lineitem_2); Lineitem_2 in turn is contained in Lineitem_3; Lineitem_3 is equivalent to the deleted relation Lineitem; Lineitem_3 is contained in Lineitem_4; and Lineitem_4 is contained in Lineitem_5 (i.e., Lineitem_1 ⊆ Lineitem_2 ⊆ Lineitem_3 = Lineitem ⊆ Lineitem_4 ⊆ Lineitem_5). Therefore, replacing Lineitem with Lineitem_i, for 1 ≤ i ≤ 5, we get five alternate yet legal rewritings with different view extents and view maintenance costs (note that we assume the setting VE = '≈' for this view, as given in Equation 14). Setting the system parameters to w_1 = 0.7, w_2 = 0.3, ρ_D1 = 0.5, ρ_D2 = 0.5, ρ_attr = 0.7, ρ_ext = 0.3, cost_M = 0.52 sec/message, cost_T = 0.000623 sec/byte, cost_I/O = 0.00196 sec/I/O-operation, ρ_quality = 0.9, and ρ_cost = 0.1, we get the metrics of quality and cost that are summarized in Table 3 (see also Case 1 in Figure 8). The above coefficients are empirically validated using experiments that are described in Sec. 6.2.3. The other two cases in Figure 8 are obtained with (ρ_quality = 0.75, ρ_cost = 0.25) and (ρ_quality = 0.5, ρ_cost = 0.5), respectively. In Section 3 we postulated that the degree of divergence DD(V_i) for a view rewriting V_i will be large for a relation whose size is very different from the size of the original relation, and vice versa. The cost of a legal rewriting will be larger, all other factors being equal, with a growing size of the replaced relation(s). Trading off these two factors
Table 3. Ranking of legal rewritings for experiment 6.2.2 (detailed data for Case 1; ρ_quality = 0.9).

Figure 8. Results of assessing legal rewritings for experiment 6.2.2.
against each other will therefore lead to different results depending on how the trade-off parameters are set. Our experiment validates these findings. For example, when the parameters are set to (ρ_quality = 0.9, ρ_cost = 0.1, Case 1), the QC-Model chose legal rewriting V_3 over the other four legal rewritings. Here, we give a high priority to the quality of the rewriting, which is best when the replacing relation comes as close as possible to the original relation, which is the case in legal rewriting V_3. The graph depicted in Figure 8 shows that the overall efficiency increases from legal rewriting V_1 until V_3 (because the size of the replacing relation approaches the size of the original relation), then becomes worse as the difference between the relation sizes grows bigger. However, in Case 3, with (ρ_quality = 0.5, ρ_cost = 0.5), the cost has a larger impact on the overall efficiency of the legal rewriting. Since the cost continuously increases as the replacing relations get bigger (i.e., from legal rewriting V_1 to V_5), the overall efficiency of the rewritings decreases, so rewriting V_1 (with the smallest replacing
relation) is chosen by our view synchronizer. Even in Case 2, the influence of the cost on the total result is large enough for V_1 to be selected as the best legal rewriting. Two observations we made from Figure 8 are:

• If we focus our attention on the legal rewritings V_3, V_4, and V_5 (labeled 3, 4, and 5 in Figure 8, rows 3 to 5 in Table 3), we can see that these rewritings are obtained by substituting the deleted relation Lineitem by a superset relation. Among these three legal rewritings, V_3 is always ranked highest among the three in the various parameter settings. This is because the degrees of divergence (fourth column in Table 3, labeled DD) as well as the view maintenance costs (fifth column, labeled Cost) go up when the cardinalities of the replaced relations go up. For these cases, the trade-off parameters have no influence on which rewriting is selected as best. A consequence is that if we have only superset replacements at our disposal, the replacement that is closest to the original in terms of relation size is also the smallest replacement and will always rank best among the legal rewritings.
• If we focus on the legal rewritings V_1, V_2, and V_3 (labeled 1, 2, and 3 in Figure 8, rows 1 to 3 in Table 3), these rewritings are obtained by replacing the deleted relation Lineitem with a subset relation. The degrees of divergence of the rewritings go down as the sizes of the replacement relations go up (column four in the table), but the view maintenance cost of the legal rewritings increases with the cardinality of the substituted relations (column five). Therefore, the overall efficiency of these rewritings depends on the trade-off parameters. For Case 1, V_3 is the best among the three. For Cases 2 and 3, i.e., when the view maintenance costs have a higher weight, V_1 is ranked higher by the efficiency model.

6.2.3. Experiments on accuracy of cost model prediction

EXPERIMENTAL DESIGN. We conducted a series of experiments that support the soundness and correctness of the cost part of our QC-Model, namely, to determine how well the estimation that our cost model gives predicts the actual cost of maintenance after data updates. An important result of this experimental study is that it yields a method to empirically compute the unit costs cost_M, cost_T, and cost_I/O (Equation 10), which will be described in this section. For different setups, these parameters will be different but generally constant for a given data warehouse implementation. Thus, for other implementations using this cost model, one could use our proposed suite of experiments to calibrate these factors. While the cost part of QC incorporates aspects such as data transported, I/O-cost at the ISs, etc., for these experiments we measure cost as the (real) time it takes our data warehouse to update its extent after a data update in an underlying IS. Using a fixed view and IS schema, we conducted the following experiments:
1. Inserting tuples into different-sized base relations with a constant join selectivity, i.e., with a constant number of tuples joining with the update tuples. The purpose of this experiment was to assess the impact of I/O-cost (CF_I/O) on the QC-Value computation while keeping the other two cost factors constant.
CF_M    CF_T    CF_I/O    t_upd (sec)
40      6750     2130      24.5
40      6750     4260      33.2
40      6750     6430      37.9
40      6750     8580      43.0
40      6750    10760      47.0
40      6750    15940      52.1
40      6750    21170      65.5

Figure 9. Execution times for updates on different-sized, constant-selectivity ISs.
2. Inserting tuples into a base relation whose join selectivity changes with its size. This leads to changes of I/O-costs and network costs (CF_I/O and CF_T). The influence of the I/O-costs can be eliminated from the results by using the findings from the previous experiment, thus allowing us to isolate CF_T.
3. Inserting different-sized sets of random tuples into the same information space (i.e., resetting base tables after each experiment). This leads to a changing number of messages, since some updates will lead to non-empty join results while others will not join with any tuple in the base relations of the view. Together with the findings from the previous two experiments, the influence of the number of messages on view maintenance cost can be assessed.

Under the assumption that the three cost factors are orthogonal (i.e., linearly independent from one another), we expect to have a linear correlation between the (analytically obtained) cost factor and the (measured) view maintenance time for each of the three experiments. Using linear regression, we can then deduce the actual values of the unit costs cost_M (in seconds per message), cost_T (in seconds per byte), and cost_I/O (in seconds per I/O-operation). If all three experiments in fact do show linear correlation, we conclude that the three-factor cost model is sound and that the three base factors do not significantly influence each other.

INFLUENCE OF I/O-COST. First, we keep the number of messages and the number of bytes transferred constant and focus on changing I/O-costs. The values in Figure 9 (columns 1-3) were obtained using formulas that accurately describe the view maintenance algorithm used in this implementation, whereas the execution time (column 4) was measured using PC system time. Leaving CF_M and CF_T constant, we can now compare how well our cost model predicts the actual execution cost using the I/O-cost as the main cost factor (columns 3 and 4 in Figure 9). Executing a linear regression on the data pairs in those two columns and computing the slope of the regression function yields cost_I/O = 1.96 · 10⁻³ sec/I/O-operation. The correlation coefficient r for an assumed linear correlation is 0.98. We will now assume that the influence of the I/O-cost on the total cost that we found is independent
CF_M    CF_T    CF_I/O    t_upd (sec)    t_adj (sec)
40      6750     2130      25.5           21.3
40      9620     4090      33.3           25.3
40     12490     6050      38.0           26.2
40     15360     8010      41.6           25.9
40     18230     9970      45.5           26.0
40     26840    15850      65.6           34.6
40     32580    19770      77.2           38.5

Figure 10. Execution times for updates on ISs with varying selectivity.
from the other two cost factors, so we can use cost_I/O to eliminate the influence of I/O-cost in later experiments. The high correlation in this data set suggests a strong linear correlation between I/O-costs and actual execution cost when the other two measures are held constant. This also means that there are no other important influences on the execution cost besides the three factors evaluated here.

INFLUENCE OF THE NUMBER OF BYTES TRANSFERRED (NETWORK COST). Next, we evaluate the influence of the amount of data transferred on the view maintenance cost in a similar fashion by running a second experiment. In order to compute the adjusted execution time in Figure 10, we multiply the number of I/Os with the value for cost_I/O obtained above and subtract this time from the measured execution time (t_adj = t_upd − CF_I/O · cost_I/O). We expect a linear correlation between columns 2 and 5 in Figure 10, meaning a linear dependency between the number of bytes transferred and the execution cost when eliminating the other two cost factors. Assuming correlation between CF_T and the adjusted query execution time t_adj, linear regression yields a unit cost for the number of bytes transferred of cost_T = 6.23 · 10⁻⁴ sec/byte. The correlation coefficient is 0.97, which suggests a strong linear correlation.
INFLUENCE OF THE NUMBER OF MESSAGES. For Figure 11, we eliminate both network and I/O-cost in the way described above to determine the unit cost for messages: t_adj = t_upd − CF_I/O · cost_I/O − CF_T · cost_T. The last line in this table represents a set of updates that did not join with any tuples in the underlying relations. Thus, the I/O-cost is 0. Eliminating the other two cost factors, we postulate a linear correlation between CF_M and the adjusted time. Regression yields cost_M = 0.53 sec/message with a correlation coefficient of 0.91. This again suggests a strong correlation between the number of messages and the total cost when the other two factors are eliminated. We find a remaining constant overhead time for our system of about 4.7 sec which cannot be accounted for using the three cost factors. This time is assumed to be constant for any incremental update, a finding which is supported by our experiments.
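The calibration step itself is plain least-squares fitting. As a sketch, the following self-contained Python fragment recomputes the slope and correlation coefficient for the I/O experiment from the Figure 9 data; the same helper applies unchanged to the CF_T and CF_M regressions on the adjusted times.

def ols_slope_r(xs, ys):
    """Least-squares slope of y on x and the Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

# Columns 3 and 4 of Figure 9: CF_I/O vs. measured update time t_upd.
cf_io = [2130, 4260, 6430, 8580, 10760, 15940, 21170]
t_upd = [24.5, 33.2, 37.9, 43.0, 47.0, 52.1, 65.5]
cost_io, r = ols_slope_r(cf_io, t_upd)
print(cost_io, r)  # ≈ 1.96e-3 sec per I/O-operation, r ≈ 0.98, as in the text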
CF_M    CF_T    CF_I/O    t_upd (sec)    t_adj (sec)
16       791     5207      21.2           10.3
32      1541     9441      39.9           19.6
40      1711     6693      47.5           32.8
48      2007    10585      50.1           28.0
48      2209    11729      56.4           31.7
64      3123    19855      74.1           33.2
64      1056        0      40.3           39.3

Figure 11. Execution times for varying number of messages.
CONCLUSIONS BASED ON EXPERIMENTAL ANALYSIS. The high correlation factors suggest that there is a correlation between our cost factors and the actual view maintenance cost. Through evaluating the cost factors separately, we have found a linear dependency between the three cost factors and the actual measured execution time. We also found unit costs that we can use to predict the actual view maintenance cost for a given view in our system. Using these unit costs, we can now evaluate whether our cost model correctly predicts the execution time (cost) for incremental view maintenance for a given view. For this, we use diverse views generated over the same base schema but in different information spaces and compute a predicted execution time by multiplying the respective values of CF_M, CF_T, and CF_I/O with the unit costs found in the previous experiments. Graphing computed and measured execution times and comparing them with the ideal line of Measured Cost = Predicted Cost, we obtain Figure 12. The figure shows the correlation between predicted and measured view maintenance costs for a number of diverse views over different information spaces. The line labeled "Ideal" is the optimum, indicating a perfect prediction of view maintenance cost for our system. We can see that our cost model predicts the actual measured cost very well. The correlation coefficient between the predicted and measured cost is 0.96; the standard error for the computation of the predicted value is 4.48.

NON-UNIFORM DISTRIBUTION OF TEST DATA. The previous experiment was carried out using data from the TPC/D benchmark test, whose data are largely uniform. It is interesting to discuss how our cost model performs under non-uniform data sets. The precision of the cost model on non-uniform data is affected by how precisely the factors CF_T, CF_M, and CF_I/O can be estimated under different distributions of the base data. The number of messages CF_M will not be affected by the data distribution. However, non-uniform data will not have a constant join selectivity, and the accuracy of the prediction of I/O-cost will decrease also. So the overall accuracy of the cost model will depend on the relative errors of the base factors CF_T and CF_I/O. It is clear that a small deviation from the uniform distribution in the base data will have a smaller effect on the cost model accuracy than larger deviations.
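As a plausibility check of the prediction step behind Figure 12, this short Python sketch combines the three calibrated unit costs; adding the roughly 4.7-second constant overhead reported above is our own assumption about how the prediction was assembled, and the sample row is taken from Figure 11.

def predicted_maintenance_time(cf_m, cf_t, cf_io,
                               cost_m=0.53, cost_t=6.23e-4,
                               cost_io=1.96e-3, overhead=4.7):
    """Predicted incremental-maintenance time in seconds for one update,
    using the empirically calibrated unit costs from Section 6.2.3."""
    return cf_m * cost_m + cf_t * cost_t + cf_io * cost_io + overhead

# First row of Figure 11: predicted ≈ 23.9 s vs. 21.2 s measured.
print(predicted_maintenance_time(16, 791, 5207))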
Figure 12. Correlation between predicted and actual view maintenance cost.
In some experiments that we ran in this context, we established that the base data distribution does have an effect on the accuracy of the cost model. If no distribution function for the base data is available, the prediction of I/O-cost and number of bytes transferred could have a large relative error. This would make the predictions of the cost model less reliable. However, small deviations in the base data distribution do not lead to significant reductions in prediction accuracy. If reliable measures for CF_T and CF_I/O are available (e.g., through a precise estimation of join sizes, even in non-uniform data), our cost model will perform better as well. In addition, the field of estimating join sizes from simple system parameters [46, 24] or by sampling [25, 16, 27] is a very active research area, and good solutions are available in the literature.

7. RELATED WORK
Materialized views over distributed information sources have been explored for a number of years. Early work focused on questions of materialized view maintenance under data updates in the sources [22, 48, 4]. More recently, questions of optimizing view queries given varying parameters or capabilities of underlying sources have also been explored. Generally, work in this area assumes that the rewritten view query computes a view extent equivalent to the original one. Prominent approaches that deal with equivalent query rewriting include work by Selinger et al. [56], with a recent optimization by Kossmann and Stocker [31], Jarke
et al. [28], van den Berg et al. [60], Du et al. [10], and Levy et al. [36]. Also important is the Volcano Query Optimizer Generator by Graefe et al. [19, 18]. Some work has been done on rewriting queries using materialized views [37, 49, 40, 57, 50]. This work is relevant to ours, although it generally deals with rewriting queries into equivalent ones using underlying views. Work on rewriting queries using views [35, 38] is used in subsequent work by Levy et al. which is closely related to our EVE project in terms of its goal of supporting views over dynamic environments, but not in the approach taken. Levy introduced the notion of the world-view as a global, fixed domain model of a certain part of the world on which both information providers and consumers must define views [39]. This work is in some sense an approach inverse to the EVE approach [53]. Where Levy et al. describe information sources in terms of a world model, we incrementally establish our world model in terms of the available sources. Levy's model provides a solution to a subset of the problems that we also solve. It is nevertheless necessary to establish a world model before any source can provide information, a very complicated and often impossible task. Also, the concepts of quality and/or cost are not explored in the context of that work. In an earlier paper [34], we introduced the overall EVE solution framework, in particular the concept of associating evolution preferences with view specifications, and we introduced several algorithms that achieve view synchronization under deletions of underlying information [43, 45, 29]. All these algorithms generate large numbers of alternative legal rewritings, thus raising the need for an efficiency model. The current paper addresses this need by establishing a model for systematically ranking otherwise incomparable solutions for view synchronization. Arens et al. [5] and the SoftBot project [14] provide approaches similar to Levy's which solve similar problems. Although addressing different issues, SIMS' process of finding relevant information sources for a query raises some of the same problems as finding the right substitution for an affected view component in EVE. The SoftBot project takes a very different approach to query processing, as it assumes that the system has to discover the "links" among data sources that are described by action schemas, and it does not use a cost model. While related to our view synchronization algorithm CVS [43], the SoftBot planning process also relies on discovering connections among information sources when very different source description languages are used. Neither SIMS nor SoftBot addresses the problem of evolution under capability changes of participating external information sources. None of these projects discusses the problem of comparing non-equivalent rewritings of queries; rather, they find some solution to a query without being able to evaluate the query result. Another relevant approach similar to Levy's is the Infomaster information integration project by Genesereth et al. [17], which tries to find the largest subset of data that can be provided for a certain query. This project is based partly on work by Abiteboul, Duschka et al. [12, 2] on answering recursive queries using views. CoBase by Chu et al. [8] relates to our work in that they also use the notion of relaxation of the query extent, similar to our E-SQL approach [53]. Chu established an SQL extension called CSQL (cooperative SQL) which relaxes the strictness of
SQL WHERE-conditions, i.e., it relaxes restrictions on the extent, but not the interface, of a view query, whereas E-SQL allows for both. Given explicitly available knowledge about an application's domain, queries can be relaxed in a stepwise manner by altering local WHERE-conditions of a query until it returns approximate results to a user. Chu's work differs from ours in that it is limited to relaxing the values of local conditions in queries, whereas we handle relaxation of all elements in a Project-Select-Join SQL query. In contrast to CSQL, in which a manually established order of relaxation of conditions is needed to compare two rewriting possibilities, we have also defined a comprehensive model of quality and cost to automatically assess the desirability of a query rewriting [32, 33] (of which our algorithms would normally generate several) in order to help a view synchronization algorithm find trade-offs among query rewritings. Important work on integrating heterogeneous sources in one view using a common semistructured data model (OEM) has been done in the TSIMMIS project [26, 41] and in a similar form by Abiteboul and others [1]. Incremental maintenance of views over such semistructured sources has also been considered, e.g., by Abiteboul [3]. For the problem of incremental view maintenance, a concept which we use in our performance studies, earlier work has been done by several other projects in the literature [21]. Blakeley et al. [6] are concerned with a centralized environment only. Also, they have looked at incremental view maintenance assuming non-concurrent updates (updates are sufficiently spaced so as not to interfere with each other; each update reaches the data warehouse before the next update is executed at any of the base relations). Lately, work on concurrent updates has been done. Based on the concept of updates interfering with each other due to long transmission times between base relations and the data warehouse, these works attack increasingly complex scenarios of handling concurrent updates by collecting update information in queues and handling them in batches. Zhuge et al. [65] introduce the ECA algorithm for incremental view maintenance and report findings on the cost of their algorithm, but in a different environment from ours (a single information source is assumed). A second paper by the same authors ("Strobe" [66]) extends their findings towards multi-source information spaces, but does not incorporate any performance model or cost studies. Agrawal et al. [4] propose the SWEEP algorithm, which can ensure consistency of the data warehouse in a larger number of cases compared to the Strobe family of algorithms. Finally, Zhuge et al. [65] contains a performance study. However, their work is limited to a comparison between traditional view recomputation and incremental view maintenance algorithms, and does not address the issue of view rewritings, nor does it compare quality and cost between different rewritings for a query. Preliminary results of this work have been published at the IDC'99 conference [33] and in a one-page poster summary in ICDE'99 [32]. This previous conference paper identified the problem of non-equivalent rewritings and presented a preliminary discussion of the idea of the QC-Value. It does, however, not cover the implementation of the system, does not discuss the importance of workload models for the QC-Value, and omits a number of details that are necessary to fully evaluate the approach.
Furthermore, it does not give an in-depth evaluation of the approach, which is the core contribution of this current work.

8. CONCLUSION
View synchronization refers to the new and important problem of how to maintain views in dynamic distributed information systems [53]. These issues become important as more and more diverse and autonomous database systems are incorporated into large data warehouses. Local meta data updates at information sources participating in a data warehouse will generally cause a view in the warehouse to become invalid. This problem has been addressed by our previous work on the EVE project [34, 43, 29]. In this work, we focused on performance issues raised by view synchronization. Since view evolution under schema changes of underlying data sources will generate a large number of possible rewritings for an original view query, it is necessary to compare these rewritings and identify the best solution for maintaining a view. A novel measure of efficiency is introduced in this paper that explores the two dimensions of quality and cost and leads to the definition of the QC-Model. This model can be used to establish a ranking among alternate legal query rewritings for an affected view definition. It turns out that a ranking is possible among seemingly incomparable solutions using the QC-Model we developed, and that it is feasible to introduce parameters to trade off quality against cost (and also sub-dimensions of either against each other). While we have used a simple cost model in this paper and have not dealt with query optimization, alternative cost models can be incorporated as well, as long as they can correctly predict the incremental view maintenance cost of an arbitrary query under some workload model of updates. A combination of a query optimizer (producing equivalent rewritings) with our approach could, for example, lead to a system that finds view rewritings that show low divergence (i.e., are very similar to the original view) at a much lower execution cost. We have conducted experiments that analyze the properties of our model, such as correlations between certain parameters. Also, we have run performance measurements and conducted a statistical analysis of the trade-off parameters in the cost model. A high correlation between computed view maintenance cost and actual cost (execution time) was found. The results of this work are being used in the EVE-System in an evaluation module for the view rewritings generated by our view synchronization algorithms. Future work includes a deeper study of how possible extensions of the model affect the quality dimension of our work, more sophisticated solutions for the cost part of the model (for instance, taking connection cost of information sources into account), and the support of other types of information sources (e.g., semistructured ISs through wrappers).

ACKNOWLEDGMENTS
This work was supported in part by several grants from NSF, namely the NSF NYI grant #IRI 97-96264, the NSF CISE Instrumentation grant #IRIS 97-29878, and the NSF grant #IIS 97-32897. Dr. Rundensteiner would like to thank our industrial
sponsors, in particular IBM for the IBM partnership award and for the IBM corporate fellowship for one of her graduate students. The authors would also like to thank students at the Database Systems Research Group at WPI for their interactions and feedback on this research. In particular, we are grateful to Yong Li and Xin Zhang for implementing several of the EVE components, including the MKB, the VKB, and the view synchronization algorithms.

REFERENCES

[1] S. Abiteboul, R. Goldman, J. McHugh, V. Vassalos, and Y. Zhuge. Views for semistructured data. In Workshop on Management of Semistructured Data, Tucson, Arizona, 1997.
[2] Serge Abiteboul and Oliver M. Duschka. Complexity of answering queries using materialized views. In Proceedings of ACM Symposium on Principles of Database Systems, pages 254-263, New York, NY, 1998. ACM Press.
[3] Serge Abiteboul, Jason McHugh, Michael Rys, Vasilis Vassalos, and Janet L. Wiener. Incremental maintenance for materialized views over semistructured data. In Proc. 24th Int. Conf. Very Large Data Bases (VLDB), pages 38-49, 1998.
[4] D. Agrawal, A. El Abbadi, A. Singh, and T. Yurek. Efficient View Maintenance at Data Warehouses. In Proceedings of SIGMOD, pages 417-427, 1997.
[5] Y. Arens, C. A. Knoblock, and W.-M. Shen. Query Reformulation for Dynamic Information Integration. Journal of Intelligent Information Systems, 6(2/3):99-130, 1996.
[6] J. A. Blakeley, P.-E. Larson, and F. W. Tompa. Efficiently Updating Materialized Views. Proceedings of SIGMOD, pages 61-71, 1986.
[7] S. Chaudhuri, R. Krishnamurthy, and S. Potamianos. Optimizing Queries with Materialized Views. In Proceedings of IEEE International Conference on Data Engineering, 1995.
[8] W. W. Chu, M. A. Merzbacher, and L. Berkovich. The Design and Implementation of CoBase. SIGMOD Record, 22(2):517-522, June 1993.
[9] Wesley W. Chu, Hua Yang, Kuorong Chiang, Michael Minock, Gladys Chow, and Chris Larson. CoBase: A scalable and extensible cooperative information system. Journal of Intelligent Information Systems, 6(2/3):223-259, 1996.
[10] W. Du, R. Krishnamurthy, and M.-C. Shan. Query Optimization in Heterogeneous DBMS. International Conference on Very Large Data Bases, pages 277-291, 1992.
[11] Oliver M. Duschka. Query Planning and Optimization in Information Integration. PhD thesis, Stanford University, Stanford, California, December 1997.
[12] Oliver M. Duschka and Michael R. Genesereth. Answering recursive queries using views. In Proceedings of ACM Symposium on Principles of Database Systems, pages 109-116, New York, NY, 1997. ACM Press.
[13] R. Elmasri and S. B. Navathe. Fundamentals of Database Systems. The Benjamin/Cummings Publishing Company, Inc., 1994.
[14] Oren Etzioni and Daniel Weld. A softbot-based interface to the Internet. Communications of the ACM, 37(7):72-76, July 1994.
[15] EVE Project Homepage: http://davis.wpi.edu/dsrg/EVE, 1998.
[16] Sumit Ganguly, Phillip B. Gibbons, Yossi Matias, and Avi Silberschatz. Bifocal sampling for skew-resistant join size estimation. SIGMOD Record, 25(2):271-281, June 1996.
[17] Michael R. Genesereth, Arthur M. Keller, and Oliver M. Duschka. Infomaster: An information integration system. SIGMOD Record (ACM Special Interest Group on Management of Data), 26(2):539ff., 1997.
[18] G. Graefe, R. L. Cole, D. L. Davison, W. J. McKenna, and R. H. Wolniewicz. Extensible query optimization and parallel execution in Volcano. In J. C. Freytag, G. Vossen, and D. Maier, editors, Query Processing for Advanced Database Applications, page 305. Morgan Kaufmann, San Francisco, CA, 1994.
[19] Goetz Graefe and William J. McKenna. The Volcano optimizer generator: Extensibility and efficient search. In Proceedings of IEEE International Conference on Data Engineering, pages 209-218. IEEE Computer Society, 1993.
[20] A. Gupta, H. V. Jagadish, and I. S. Mumick. Data Integration using Self-Maintainable Views. In Proceedings of International Conference on Extending Database Technology (EDBT), pages 140-144, 1996.
[21] A. Gupta, I. S. Mumick, and V. S. Subrahmanian. Maintaining Views Incrementally. In Proceedings of SIGMOD, pages 157-166, 1993.
[22] A. Gupta and I. S. Mumick. Maintenance of Materialized Views: Problems, Techniques, and Applications. IEEE Data Engineering Bulletin, Special Issue on Materialized Views and Warehousing, 18(2):3-19, 1995.
[23] Peter J. Haas, Jeffrey F. Naughton, S. Seshadri, and Lynne Stokes. Sampling-based estimation of the number of distinct values of an attribute. In International Conference on Very Large Data Bases, pages 311-322, 1995.
[24] Peter J. Haas, Jeffrey F. Naughton, S. Seshadri, and Arun N. Swami. Fixed-precision estimation of join selectivity. In Proceedings of ACM Symposium on Principles of Database Systems, pages 190-201. ACM Press, May 1993.
[25] Peter J. Haas and A. N. Swami. Sampling-based selectivity estimation for joins using augmented frequent value statistics. In Proceedings of IEEE International Conference on Data Engineering, pages 522-531, 1995.
[26] J. Hammer, Hector Garcia-Molina, S. Nestorov, R. Yerneni, M. Breunig, and V. Vassalos. Template-Based Wrappers in the TSIMMIS System. In Proceedings of SIGMOD, pages 532-535, 1997.
[27] Wen-Chi Hou and Gultekin Ozsoyoglu. Statistical estimators for aggregate relational algebra queries. ACM Transactions on Database Systems, 16(4):600-654, December 1991.
[28] M. Jarke and J. Koch. Query Optimization in Database Systems. ACM Computing Surveys, pages 111-152, 1984.
[29] A. Koeller, E. A. Rundensteiner, and N. Hachem. Integrating the Rewriting and Ranking Phases of View Synchronization. In Proceedings of the ACM First International Workshop on Data Warehousing and OLAP (DOLAP'98), pages 60-65, November 1998.
[30] Andreas Koeller and Elke A. Rundensteiner. Discovery of high-dimensional inclusion dependencies. Technical Report WPI-CS-TR-02-15, Worcester Polytechnic Institute, Dept. of Computer Science, 2002.
[31] Donald Kossmann and Konrad Stocker. Iterative dynamic programming: a new class of query optimization algorithms. ACM Transactions on Database Systems, 25(1):43-82, March 2000.
[32] A. J. Lee, A. Koeller, A. Nica, and E. A. Rundensteiner. Data Warehouse Evolution: Trade-offs between Quality and Cost of Query Rewritings. In Proceedings of IEEE International Conference on Data Engineering, Special Poster Session, page 255, Sydney, Australia, March 1999.
[33] A. J. Lee, A. Koeller, A. Nica, and E. A. Rundensteiner. Non-Equivalent Query Rewritings. In Proceedings of the 9th International Databases Conference, pages 248-262. City University of Hong Kong Press, Hong Kong, July 1999.
[34] A. J. Lee, A. Nica, and E. A. Rundensteiner. Keeping Virtual Information Resources Up and Running. In Proceedings of IBM Centre for Advanced Studies Conference (CASCON'97), Best Paper Award, pages 1-14, November 1997.
[35] A. Levy, I. S. Mumick, Y. Sagiv, and O. Shmueli. Equivalence, query reachability and satisfiability in datalog extensions. In Proceedings of the Twelfth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 109-122, Washington, DC, 25-28 May 1993.
[36] A. Y. Levy, Inderpal Singh Mumick, and Y. Sagiv. Query optimization by predicate move-around. In Jorge Bocca, Matthias Jarke, and Carlo Zaniolo, editors, International Conference on Very Large Data Bases, pages 96-107, Los Altos, CA, 1994. Morgan Kaufmann Publishers.
[37] A. Y. Levy, A. Rajaraman, and J. D. Ullman. Answering queries using limited external processors. In Proceedings of ACM Symposium on Principles of Database Systems, pages 227-237, Montreal, Canada, 3-5 June 1996.
[38] Alon Levy and Yehoshua Sagiv. Constraints and Redundancy in Datalog. In Proceedings of the Eleventh ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 2-4, 1992, San Diego, CA, pages 67-80, 1992.
[39] Alon Y. Levy, Divesh Srivastava, and Thomas Kirk. Data model and query evaluation in global information systems. Journal of Intelligent Information Systems, Special Issue on Networked Information Discovery and Retrieval, 5(2):121-143, 1995.
[40] A. Y. Levy, A. O. Mendelzon, and Y. Sagiv. Answering Queries Using Views. In Proceedings of ACM Symposium on Principles of Database Systems, pages 95-104, May 1995.
[41] C. Li, R. Yerneni, V. Vassalos, Hector Garcia-Molina, Y. Papakonstantinou, J. D. Ullman, and M. Valiveti. Capability Based Mediation in TSIMMIS. In Proceedings of SIGMOD, pages 564-566, 1998.
[42] A. Nica. View Evolution Support for Information Integration Systems over Dynamic Distributed Information Spaces. PhD thesis, University of Michigan, Ann Arbor, in progress 1999.
[43] A. Nica, A. J. Lee, and E. A. Rundensteiner. The CVS Algorithm for View Synchronization in Evolvable Large-Scale Information Systems. In Proceedings of International Conference on Extending Database Technology (EDBT'98), pages 359-373, Valencia, Spain, March 1998.
[44] A. Nica and E. A. Rundensteiner. On Translating Loosely-Specified Queries into Executable Plans in Large-Scale Information Systems. In Proceedings of Second IFCIS International Conference on Cooperative Information Systems (CoopIS'97), pages 213-222, June 1997.
[45] A. Nica and E. A. Rundensteiner. Using Containment Information for View Evolution in Dynamic Distributed Environments. In Proceedings of International Workshop on Data Warehouse Design and OLAP Technology (DWDOT'98), Vienna, Austria, August 1998.
[46] Gregory Piatetsky-Shapiro and Charles Connell. Accurate estimation of the number of tuples satisfying a condition. SIGMOD Record, 14(2):256-276, 1984.
[47] Viswanath Poosala and Yannis E. Ioannidis. Selectivity estimation without the attribute value independence assumption. In International Conference on Very Large Data Bases, pages 486-495, 1997.
[48] D. Quass and J. Widom. On-Line Warehouse View Maintenance. In Proceedings of SIGMOD, pages 393-400, 1997.
[49] A. Rajaraman, Y. Sagiv, and J. D. Ullman. Answering Queries Using Templates With Binding Patterns. In Proceedings of ACM Symposium on Principles of Database Systems, pages 105-112, May 1995.
[50] A. Rajaraman and J. D. Ullman. Integrating Information by Outerjoins and Full Disjunctions. In Proceedings of ACM Symposium on Principles of Database Systems, pages 238-248, 1996.
[51] E. A. Rundensteiner, A. Koeller, A. Lee, Y. Li, A. Nica, and X. Zhang. Evolvable View Environment (EVE) Project: Synchronizing Views over Dynamic Distributed Information Sources. In Demo Session Proceedings of International Conference on Extending Database Technology (EDBT'98), pages 41-42, Valencia, Spain, March 1998.
[52] E. A. Rundensteiner, A. Koeller, X. Zhang, A. Lee, A. Nica, A. VanWyk, and Y. Li. Evolvable View Environment. In Proceedings of SIGMOD'99 Demo Session, pages 553-555, May 1999.
[53] E. A. Rundensteiner, A. J. Lee, and A. Nica. On Preserving Views in Evolving Environments. In Proceedings of 4th Int. Workshop on Knowledge Representation Meets Databases (KRDB'97): Intelligent Access to Heterogeneous Information, pages 13.1-13.11, Athens, Greece, August 1997.
[54] Elke A. Rundensteiner, Andreas Koeller, and Xin Zhang. Maintaining Data Warehouses over Changing Information Sources. Communications of the ACM, pages 57-62, June 2000.
[55] Torsten Schlieder. Schema-driven evaluation of approximate tree-pattern queries. In Proceedings of International Conference on Extending Database Technology (EDBT), volume LNCS 2287, pages 514-532. Springer, 2002.
[56] Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, and Thomas G. Price. Access path selection in a relational database management system. In Proceedings of SIGMOD, pages 23-34. ACM, 1979.
[57] D. Srivastava, S. Dar, H. V. Jagadish, and A. Y. Levy. Answering Queries with Aggregation Using Views. In International Conference on Very Large Data Bases, pages 318-329, 1996.
[58] Anja Theobald and Gerhard Weikum. Adding relevance to XML. Lecture Notes in Computer Science, 1997:105-??, 2001.
[59] Anja Theobald and Gerhard Weikum. The index-based XXL search engine for querying XML data with relevance ranking. In Proceedings of International Conference on Extending Database Technology (EDBT), volume LNCS 2287, pages 477-495. Springer, 2002.
[60] C. A. van den Berg and M. L. Kersten. An Analysis of a Dynamic Query Optimization Schema for Different Data Distributions. In J. C. Freytag, D. Maier, and G. Vossen, editors, Query Processing for Advanced Database Systems, chapter 15, pages 449-473. Morgan Kaufmann Pub., 1994.
[61] S. B. Yao. An Attribute Based Model for Database Access Cost Analysis. ACM Transactions on Database Systems (TODS), 2(1):45-67, March 1977.
[62] X. Zhang, L. Ding, and E. A. Rundensteiner. PSWEEP: Parallel View Maintenance Under Concurrent Data Updates of Distributed Sources. Technical Report WPI-CS-TR-99-14, Worcester Polytechnic Institute, Computer Science Department, May 1999.
[63] X. Zhang and E. A. Rundensteiner. The SDCC Framework for Integrating Existing Algorithms for Diverse Data Warehouse Maintenance Tasks. In International Database Engineering and Application Symposium, pages 206-214, Montreal, Canada, August 1999.
[64] Xin Zhang, Elke A. Rundensteiner, and Lingli Ding. PVM: Parallel View Maintenance Under Concurrent Data Updates of Distributed Sources. In Data Warehousing and Knowledge Discovery, Proceedings, pages 230-239, Munich, Germany, September 2001.
[65] Y. Zhuge, Hector Garcia-Molina, J. Hammer, and J. Widom. View Maintenance in a Warehousing Environment. In Proceedings of SIGMOD, pages 316-327, May 1995.
[66] Y. Zhuge, Hector Garcia-Molina, and J. L. Wiener. The Strobe Algorithms for Multi-Source Warehouse Consistency. In International Conference on Parallel and Distributed Information Systems, pages 146-157, December 1996.
WEB DATA EXTRACTION TECHNIQUES AND APPLICATIONS USING THE EXTENSIBLE MARKUP LANGUAGE (XML)
JUSSI MYLLYMAKI AND JARED JACKSON
1. INTRODUCTION
The driving force behind the technology revolution has always been just one thing: information. Almost every invention related to the computer since the transistor has been made to aid in transferring a piece of information, or data, from one place to another. Despite the existence of a primitive form of what we now know as the Internet, less than one generation ago digital information mostly needed to be carried around on magnetic devices such as tapes and disks. Fortunately, the prominent rise of the Internet and the World Wide Web in the mid-1990s removed the barrier that physical transportation of data placed on us. Today, nearly every company, institution, or organization of note makes use of now-ubiquitous Web technologies and avails all those connected to the Internet of an abundance of information. Product catalogs, financial reports, service offerings, published information such as news reports, and more are stored on servers waiting to be queried by anyone from anywhere around the world. This wealth of information can be extraordinarily powerful for those who are able to filter through it and use it to their advantage. The aim of this chapter is to introduce the key concepts behind how Web-based data is distributed and how this data can be collected in an efficient manner for future processing. First, a brief description of the relevant technologies that make up the Web will be given. This description will then be augmented with an examination of how data is delivered using Web technologies. With these concepts understood, we will
illustrate how to use common tools of the Web to recreate the data sources used by those serving up the data we are interested in, and to store them in such a way that we can use the data for our own purposes. The recreation of external data affords us the opportunity to work with the data in real time, cache it for later processing, and conduct analysis on data accumulated over time. These advantages show the rising importance of data extraction and the need for modern businesses to understand the technology.

2. WEB DATA EXTRACTION
2.1. Why Web data is important
Since Web-based data extraction is not an effortless process, we need to ask whether we gain anything by it in the first place. There are many sources of data, some easier to process than others. Print and voice sources are the most difficult to work with, but are still widely used. In complete contrast, some companies and organizations now offer direct connections to portions of their databases through Web Services [31] or other similar technologies. These technologies allow others to work directly with external data without involvement in the middle layer of Web data extraction. A major drawback to extracting information from print-based media is that it is quickly made obsolete, while the dynamic nature of the Web allows for continual updating of the desired data. While there are no considerable drawbacks to using Web Services, they are unsurprisingly rare to find for accessing proprietary information. For instance, if some companies were to provide the information they make available via a Web-based catalog of products by exposing portions of their database directly, they might offer some of their competitors an easy-to-obtain advantage over them. For this reason key information is often only available through Web pages and not as Web Services. Extracting information is not without challenge. On many sites, particular Web pages require some form of access control or authentication in order to view them, such as requiring a user to log on to the site with a site-determined username and password. The various challenges behind Web data extraction and their solutions will be covered later in this chapter. So what value does all of this data bring to us? First there is a cost consideration, since many companies may charge large sums of money for services delivering data that is already available for free on their own Web site. Despite some technical challenges, there are many applications that can make valuable use of this information. The possibilities are bound only by the creativity of the developer. Applications already exist for integrating information and presenting consolidated results, gaining competitive intelligence, managing supply chains, implementing competitive pricing and advertising, etc. New applications of this technology are being discovered and applied constantly.

2.2. Core technologies behind the World Wide Web
Before any Web-based data extraction can begin, a basic understanding of how the information flow of the Web works needs to be gained. The architecture of the Internet has many components. Web servers are machines connected to the Internet that accept
Figure 1. Interplay of Web Technologies: Web Client, Web Server, HTTP, HTML, Web Application, and Backend Database.
queries, or data requests, from other machines connected to the same network. The way in which the request to the Web server and its response are communicated on the Web is through a protocol called the Hyper Text Transport Protocol, or HTTP [11]. HTTP requests are sent from the requesting computer to the Web server, and HTTP responses are returned with the data that has been requested. The most common scenario for this is when a computer user enters a Uniform Resource Locator (URL) into a Web browser and a Web page is returned, formatted by the browser, and presented visually to the user. The most common response to an HTTP request is in the form of Hyper Text Markup Language, or HTML [10]. HTML is a text-based, human-readable way of formatting a document for presentation to a human reader. HTML works by placing tags around portions of the page's content to alter its presentation or add meta-data to the document. HTML is the pre-rendered form of Web pages, and is the primary source used in Web data extraction. A similar technology, the eXtensible Markup Language, or XML [36], has gained prominence of late due to its common use in transferring Web data that is not necessarily rendered as a document within a Web browser (e.g. in Web Services). XML is similar to HTML in its structure, but instead of formatting data for presentation to a human reader, it formats the data to be easily processed by a computer. XML is text based and human readable, just like HTML, which makes it both easy to learn and use. Web technologies are used to connect Web servers, HTTP, HTML, and XML together to deliver information from one source to another. Figure 1 demonstrates the inter-workings of these technologies. The request is sent via HTTP from a client machine to a Web server. The server processes that request, formats the resulting response in either HTML or XML, and returns the response to the requesting client machine.
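As a minimal illustration of this request/response cycle, the following sketch (our own Python example; the URL is a placeholder, not one from the text) issues an HTTP GET request and reads back the HTML that the server returns:

# Issue an HTTP GET request and read the HTML response.
from urllib.request import urlopen

with urlopen("http://www.example.com/index.html") as response:
    html = response.read().decode("utf-8", errors="replace")
print(html[:200])   # the beginning of the pre-rendered HTML page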
2.3. The challenges of web data extraction

The greatest challenge faced in extracting Web data comes from the loosely structured nature of the Web. During the browser wars of the mid-1990s the most popular Web browsers engaged in a pattern of becoming more and more tolerant of ill-formatted HTML pages. This had the positive effect of making pages otherwise unreadable available for browsing; however, the long-term effect was that major errors in Web
pages were never fixed and in fact were propagated and multiplied in future iterations of those pages. This is evidenced by Web pages whose source code is missing required tags or whose format and syntax are almost unreadable, even to a developer trained in Web page construction. Even the most widely used sites on the Internet today are often full of pages that do not conform properly to the published HTML standard. A second challenge presents itself in the dynamic nature of the Web. Few pages of note on the Web are statically defined, meaning that a Web request simply looks up an HTML file on the Web server's file system and returns that file unaltered. Instead, Web servers often compose their responses from a variety of data sources, any of which could change at any time. The most common scenario for "interesting" Web data comes from a Web server communicating directly with a back-end database. Electronic commerce (E-commerce) Web sites are a good example of this scenario. Product information is stored and manipulated by the owning company on a central database, and when a Web browser requests the information, the server automatically retrieves the information from the database, allowing the data to be updated and presented accurately in real time. This dynamic presentation of data does not mean that there is no underlying order to work with. If there were not, the task of data extraction would be nearly impossible. The typical working of a Web server is to insert the variable data into the Web page response through the use of some sort of template. A template defines the unchanging portions of the Web page and provides the Web server with windows where it can put dynamic content. Examples of these template technologies include Sun's Java Server Pages (JSP) [16], Microsoft's Active Server Pages (ASP) [1], and the Extensible Stylesheet Language (XSL) [40]. Like the Web pages themselves, these templates are susceptible to change over time as Web developers add or remove features on the page or even just update the page to change its look and feel. The last primary challenge in extracting Web-based data is then to make sure the solution is robust. This means that our extraction technique should not fail in light of minor changes from page to page or from iteration to iteration of the template. While this may seem like a monumental task, there are good techniques that we will elaborate on that make this work less daunting. It is also important to note that, while small changes within the template are somewhat common, an empirical analysis of Web sites owned by corporations has shown that these templates rarely change in any large-scale fashion. Companies invest heavily in the development of their own look and feel, and the costs of changing to a new template are so high that it is reasonable to rely on the continued use of particular templates on one Web site for several years. Given these challenges, the goal of Web data extraction is in effect to impose order and strict structure on data that is at best semi-structured. This is often possible because the templates give us just enough common structure across similar pages in a site and over time that we can still identify the portions of the page we deem relevant. The challenges illustrated above provide a preview of the considerations that have to go into the development of the technology for Web data extraction. These technical
obstacles generally lie in the broad categories of design, change, and solution. We should now note that there are other challenges that present themselves in the accomplishment of this goal that are less technical in nature. In Section 3 we explore further these technical challenges and other problems that come up that are less related to the direct extraction of data, such as legal considerations. Of course, we will also examine the solutions that may be used in order to overcome these problems.

2.4. Using XML technologies in web data extraction
Our technique for extracting data from a Web source is to transform the information given to us from the Web server into an XML document. It is certainly legitimate to ask why XML is used at all. Why not just use our transformation mechanism to store the extracted data directly into our own database, since that is our ultimate goal? While inserting data directly into a database is certainly possible, there are several advantages to using XML as a middle layer. One such advantage is that XML provides a method of imposing schemas on documents that is both easy to read and flexible. Since robustness is a key factor in the world of changing Web sources, extraction developers will need to be able to adapt their data models to the changing information at hand. This is a much more complicated task when dealing directly with database calls. Adding to this advantage is the core integration that modern databases now have with XML. The most recent versions of all top-of-the-line databases have tightly integrated processes for importing and exporting data to and from XML documents. The integration tools offered by these databases will only improve in the near future. Thus we can leverage the ease of use and adaptability of XML without adding too much overhead to the entire process. A second advantage to XML is its relation to existing Web sources. XML and HTML have much in common, and mapping data from one to the other has become simple using XSL, another common Web technology. Also, as many Web sites begin adding Web Services support, the source of the data we extract will already be in XML, and we can use the same process to go from the XML given to our desired XML as we use to translate data from the HTML source to our XML result. This process has become a de facto standard in working with XML and is easily integrated into most modern business systems.
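To make the HTML-to-XML mapping concrete, the sketch below (our own example, not from the text; it assumes the third-party lxml library is available, and the page structure and element names are invented for illustration) applies a small XSL stylesheet that pulls product rows out of an HTML table and emits them as XML:

# Transform an HTML product table into an XML document using XSL.
from lxml import etree, html

stylesheet = etree.XML(b"""
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <products>
      <xsl:for-each select="//tr[td]">
        <product>
          <name><xsl:value-of select="td[1]"/></name>
          <price><xsl:value-of select="td[2]"/></price>
        </product>
      </xsl:for-each>
    </products>
  </xsl:template>
</xsl:stylesheet>""")
transform = etree.XSLT(stylesheet)

page = html.fromstring("<html><body><table>"
                       "<tr><td>MegaAwesome 2.0</td><td>75.00</td></tr>"
                       "</table></body></html>")
print(str(transform(page)))   # prints the generated <products> document

Because the stylesheet is data rather than code, it can be adjusted when the source template changes without touching the surrounding extraction program, which is one practical payoff of the XML middle layer.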
With this overview of the technologies to be used in Web data extraction, we now need to consider the business requirements and systems behind the technology. We discuss these in the next section.

3. FROM WEB TO SYSTEMS

3.1. Business requirements
While data extraction can be applied to any application domain that benefits from the public information available on the World Wide Web, it is particularly advantageous to companies that wish to incorporate external information and knowledge into their decision-making processes. For example, information on the pricing and features of
competitive products on the market is a natural input to the pricing strategy of any company. These modern business systems impose more stringent requirements on the data extraction process than would otherwise be the case. For instance, a university research team that wishes to retrieve news articles from the Web for research purposes might want to control the extraction process manually and use the file system as the repository for the resulting news article files. In contrast, the market analysis department of a company might want to run the same extraction process continuously and have the news articles flow into a business intelligence engine, which in turn triggers alerts to executives who are interested in certain topics appearing in the news. The company needs a continuous, reliable data extraction process that works silently in the background ("lights-off operation"), requires little manual effort, and provides powerful monitoring and administration tools. E-commerce is a prime example of a business process that can both provide and utilize real-time product information. Suppose company A is interested in retrieving information from the E-commerce server of another company B, which may be its supplier, vendor, competitor, or business partner. The anatomy of such an E-commerce server typically consists of three main components: a backend database where data is stored, an application server that contains programs for accessing the database, and a Web server that provides the visual interface to the system in the form of HTML pages. The backend database may be tightly integrated into the business process of the company, or it may just be extracted daily from some other database which the company does not want to make public. The application server runs the business logic, for example shopping cart management and invoice processing. The Web server is configured with a set of HTML templates that convert data from the database into HTML pages.

3.2. Database-centric data extraction
A shallow data extraction process would attempt to use a Web crawler to find as many product pages on the E-commerce server as possible and extract the information contained on all of those pages. A deeper, more aggressive approach is to attempt to replicate the actual backend database of the target E-commerce server. In essence, the goal is to copy the remote database as completely as possible by accessing it through the Web server. The retrieved data is stored in a local database that is structured as identically as possible to the remote database but is initially empty. Appropriate data mappings between the remote and local database are required if the precise structure (database schema) of the remote database is not known. This will commonly be the case, since not all aspects of the remote database are visible through the Web server. Database metadata such as consistency rules, constraints, and triggers are examples of items that are not visible through the Web server. While these metadata may be deducible from the data itself, the primary objective of a company in this situation is to get hold of the data itself and not the metadata. This database-centric view suggests the following model for performing the data extraction. A crawler is used to periodically fetch pages from the target Web site and
extract data as a set of well-structured XML documents. The XML data is converted to a set of insert and update operations on the local database. Performing those operations refreshes the local database so that it contains a replica of the remote database. The company can now execute sophisticated data analysis or decision support applications on the local database without requiring continuous access to the remote database.
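The conversion from extracted XML to database operations can be sketched as follows (our own Python example; the table layout, field names, and sample records are invented, and SQLite stands in for the local replica database):

# Load extracted product records from XML into a local replica table,
# using insert-or-update semantics so repeated crawls refresh the copy.
import sqlite3
import xml.etree.ElementTree as ET

xml_data = """<products>
  <product><name>MegaAwesome 2.0</name><price>75.00</price></product>
  <product><name>Quick Turn SE</name><price>84.99</price></product>
</products>"""

db = sqlite3.connect("replica.db")
db.execute("CREATE TABLE IF NOT EXISTS product (name TEXT PRIMARY KEY, price REAL)")
for p in ET.fromstring(xml_data):
    db.execute("INSERT OR REPLACE INTO product (name, price) VALUES (?, ?)",
               (p.findtext("name"), float(p.findtext("price"))))
db.commit()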
3.3. Crawler-based data extraction

The basic mechanism for retrieving Web pages in a controlled, automated fashion is a well-understood topic. Search engines are a prime example of systems that involve fetching pages from Web sites. Some of the key parameters used to configure crawlers (also known as spiders and robots) are seed URLs, crawling depth, and include/exclude rules. The crawler starts by retrieving pages listed in its seed URL set. The seed URLs point to one or more pages on the target Web site one wishes to extract data from. On an E-commerce Web site, a likely seed URL would point to the root page of the product catalog section of the site. In a corporate intranet, the seed URL set could include the home page URLs of different business units, organizational units, or geographic units. A corporation may have a well-connected intranet, in which case listing the top home page of the intranet as the seed URL is sufficient. In large corporate intranets, this is not likely to be sufficient, and the root page of each disconnected sub-intranet needs to be listed individually. Crawling depth specifies the number of hops a crawler will move away from a seed URL. A depth of 1 means that the crawler will fetch seed URLs and all pages they directly point to via hyperlinks. Increasing the depth to 2 extends the coverage of the crawl by also fetching pages pointed to by pages that are directly linked to the seed URLs. The crawling depth parameter has an exponential effect, since every page contains links to many other pages. Since there may be many paths from a seed URL to a given page in the network, it is important to eliminate duplicates so that the same page (and all pages it points to and the pages they point to!) is not fetched multiple times. The crawler can be configured to follow certain links and not others. A rule-based approach is a powerful method for achieving this. A rule specifies a URL pattern based on the protocol, hostname, and other parts of the URL syntax. An include rule says that any link that matches the pattern is followed. Conversely, an exclude rule tells the crawler not to follow a matching pattern. The rules are listed in some precedence order. For example, it might be necessary to say that pages at the Web site mycompany.com are to be crawled, except those that use any protocol other than HTTP, but FTP links to ftp.mycompany.com should still be crawled. The concept of seed URLs, crawling depth, and include/exclude rules brings us to the notion of crawling scope. In Web data extraction, the goal of crawling is to fetch certain, interesting portions of a Web site or sites. The goal can be stated more formally as follows: retrieve pages that are of interest by starting at a convenient seed URL and following direct or indirect links to pages that contain interesting information. Retrieving any page that does not directly contribute to the goal is wasteful and
Figure 2. Home Page of the ACME Gizmo Superstore Web Site.
indeed counterproductive. Minimizing the number of unnecessarily fetched pages helps reduce the load on the crawler, and perhaps more importantly, on the network and target Web site. If the load placed on the target Web site is excessive, it is likely that the owner of the Web site notices the surge in traffic and asks the crawler to stop crawling that site. Determining the optimal seed URL set, crawling depth, and include/exclude rules is a difficult problem in general but tractable in practice. One technique is to ask a human user to browse the target Web site and visit pages that contain interesting information. A tool can record the URL of pages visited, starting from the home page of the target Web site and traversing the navigational structure down to the pages one wishes to extract. If a sufficient number of pages are visited and recorded, one can apply data-mining techniques to discover common patterns.
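A crawler driven by these three parameters can be sketched as follows (our own simplified Python example; the seed URL and patterns anticipate the ACME example discussed next and are hypothetical, and a production crawler would add politeness delays, error handling, and a proper HTML parser):

# Breadth-first crawl bounded by seed URLs, a depth limit, and
# include/exclude rules; duplicate URLs are fetched only once.
import re
from collections import deque
from urllib.parse import urljoin
from urllib.request import urlopen

SEEDS = ["http://acmegizmo/catalog.html"]
MAX_DEPTH = 1
INCLUDE = [re.compile(r"category_")]
EXCLUDE = [re.compile(r"category_Accessories")]

def allowed(url):
    if not url.startswith("http:"):              # skip HTTPS, FTP, NNTP, ...
        return False
    if any(rule.search(url) for rule in EXCLUDE):
        return False
    return any(rule.search(url) for rule in INCLUDE)

def crawl():
    seen = set(SEEDS)                             # duplicate elimination
    queue = deque((url, 0) for url in SEEDS)      # seeds are always fetched
    while queue:
        url, depth = queue.popleft()
        page = urlopen(url).read().decode("utf-8", errors="replace")
        yield url, page                           # hand the page to the extractor
        if depth >= MAX_DEPTH:
            continue
        for link in re.findall(r'href="([^"]+)"', page):
            link = urljoin(url, link)
            if link not in seen and allowed(link):
                seen.add(link)
                queue.append((link, depth + 1))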
Figure 3. List of Categories and Subcategories on the ACME Gizmo Superstore Web Site.
To illustrate the process of determining proper crawler configuration parameters, consider an online shopping site called "ACME Gizmo Superstore." The home page of the company (Figure 2) provides links to many parts of the Web site, one of which is the product catalog link titled "Acme Products" (Figure 3). The product catalog page lists several product category-subcategory combinations such as "Hard Drives-40 to 80 GB" which in turn lists individual products (Figure 4). Additional product pages are shown in Figure 5. By following the links as just described, it is possible to start at the home page and get to the product pages by following 2 links (crawling depth 2). However, since there are no direct links to product pages from the home page and from the home page one must go to the product catalog page, we consider the product catalog page to be a better starting point, or seed URL. It lets us reduce the crawling depth to 1, which
Figure 4. List of Products on the ACME Gizmo Superstore Web Site.
improves crawling efficiency and reduces the possibility that the crawler would wander into parts of the Web site that it was not meant to go to. An alternate configuration would be to list all product category pages as seed URLs (and reduce crawling depth to 0), but this would require us to maintain that list over time. For example, if a new category is added, the corresponding category page needs to be added to our list. Therefore, it is preferable to start from the product catalog page and just follow whatever categories the Web site happens to have at the time of crawling. Next, we need to figure out the appropriate include/exclude rules. The seed URL is implicitly included in the include rule but we can still list it explicitly. The URL to product category pages appears to follow a common pattern "category_X_Y/index.html" where X denotes the category name and Y is a subcategory name. We add an include rule "category_*" where the asterisk matches any category and subcategory combination.
Figure 5. Additional Product Listing Pages from the Acme Gizmo Superstore Web Site.
Suppose we do not want to crawl products in certain categories, for example those in the Accessories category. The exclude rule "category_Accessories*/index.html" tells the crawler not to follow links leading to the Accessories category. We can also tell the crawler not to follow links that use other protocols, such as HTTPS, FTP, or NNTP. Figure 6 shows the resulting crawling configuration expressed in the XML syntax adopted by the Grand Central Station crawler [32].
3.4. Challenges

Implicit in the processes described above is the assumption that data extraction is robust and trouble-free. However, as we have already highlighted earlier, detailed analysis of the steps involved in data extraction and their legal ramifications raises several challenges. These challenges may be broken into four main categories: legal, semantic, design, and change management.
Legal challenges. While information published on the World Wide Web is by and large public, the right of companies to automatically extract it and use it for their business advantage is debatable. Product information in particular has been aggressively protected by companies that own the data but want to publish it on the Web for human users to see. A case in point is the lawsuit filed by auction company eBay against "auction aggregator" company AuctionWatch.com in the late 1990s [6]. The lawsuit claimed that AuctionWatch.com was illegally retrieving auction data from the eBay Web site and republishing it on their own Web site. A casual user was not made aware of the fact that the data originated from eBay and furthermore was shielded from the advertising eBay wanted to display together with the auction information. The lawsuit was settled out of court.
270 Jussi Myllymaki and Jared Jackson
<seed-list>
  <seed url="http://acmegizmo/catalog.html"/>
</seed-list>
<exclude-pattern-list>
Figure 6. Sample Crawler Configuration File Expressed Using an XML Syntax.
An additional challenge comes via a key technology that informs prospective users of Web data whether the data can be crawled and extracted automatically. This technology is known as the Robot Exclusion Standard (RES) [19] and manifests itself in the form of a "robots.txt" file placed on the Web site by the site owner. RES is a gentlemen's agreement that specifies which crawlers can access the Web site robotically and specifically which parts of the Web site are off-limits. The owner of an E-commerce server, for example, might want to tell crawlers that the product information section of the Web site cannot be crawled and extracted. In practice, however, our empirical studies have shown that the vast majority of Web sites do not take advantage of the RES standard to protect themselves and are therefore open to crawler access.
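Checking a site's robots.txt before crawling is straightforward; the sketch below (ours, using Python's standard robot-parsing module; the site and user-agent name are hypothetical) refuses to fetch a page that the site owner has declared off-limits:

# Honor the Robot Exclusion Standard before fetching a page.
from urllib.robotparser import RobotFileParser

robots = RobotFileParser("http://acmegizmo/robots.txt")
robots.read()
if robots.can_fetch("AcmeResearchCrawler", "http://acmegizmo/catalog.html"):
    print("crawling permitted by robots.txt")
else:
    print("page is off-limits to robots")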
Semantic challenges. The desire to bring together datasets originating from different sources raises the likelihood of incompatible or conflicting schemas and vocabularies. The terms used by one source to describe the features of a product may be different from those used by another source, and the units of measure and product identification information (SKU numbers) may differ. A related problem is to determine when two objects described by different sources really refer to the same object. For example, a product that is sold through different channels such as retail vendors, the OEM product market, or as part of a consulting services offering may have entirely different product model numbers depending on the channel used. Consequently, product data extracted from the Web sites of these channels is largely overlapping yet hard to integrate. Comparing the price of a product across different channels would require explicit knowledge of the mapping between product model numbers used by different channels, or some form of intelligence to analyze product descriptions and determine each product's "identity." These semantic problems may lead to missing, conflicting, and redundant information if handled without care. Referred to as Information Integration, the problem is widely recognized and has many research groups working on techniques and tools to solve it. One such project is the Clio project at IBM Almaden Research Center [26].
Design challenges. Web sites are increasingly using programming techniques that increase the level of interactivity of the site. In its simplest form, interaction means that a program script is embedded in the Web page to intercept a user's input to a form and validate it before submitting the form to the Web server. Validating the data before submission reduces data entry errors and increases the responsiveness of the Web application because validation is done locally in the user's browser. While these design techniques increase the ease of use of the Web site, they also make it harder for crawlers to access the data. For instance, if accessing a product catalog on a Web site requires the user to submit a query as opposed to just following links (browsing), the problem arises that the crawler needs to know what to enter as the query. Similarly, program scripts may encode arbitrarily complex computations that affect the URL that is ultimately accessed. For crawlers not capable of dealing with forms, scripts, and other interactivity techniques, much of the Web is left inaccessible. This difficult-to-reach part of the Web is sometimes referred to as the "Deep Web." More complex Web applications require the Web server to track the user's movements through the Web site. The concept of a session refers to a period of time during which the user enters the Web site, navigates the site and interacts with its forms and other information (e.g. shopping cart), and eventually leaves the site. Session management and tracking is usually done in one of two ways: using cookies or using variables embedded in the URL referring to the Web site. A session cookie contains a session identifier and associated host name and expiration time information [12]. Cookies are stored in the user's browser and returned to the Web server whenever the user accesses that site. Session information can be embedded into the URL of a Web site by using the Common Gateway Interface [4] mechanism. The URL lists one or more variable name-value pairs; the session identifier would be stored in one of the CGI variables. The use of session identifiers on Web sites poses requirements on the crawler that are very similar to those posed by the interactivity features of the site. The notable difference is that session identifiers are usually assigned to the user only at the beginning of the session and only on certain "start pages" of the Web site. As a consequence, a user or crawler that starts navigating the Web site on any other page may receive an "invalid session" error message and be directed back to the start page. This means that a crawler needs to be configured in such a way that it starts from a start page and carefully retains the session information across page accesses, whether the information is stored in cookies or CGI variables.
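Retaining a session across page accesses can be handled with a cookie-aware HTTP client; the following sketch (ours, in Python; the URLs are hypothetical) enters through a start page so that the session identifier issued there is automatically returned on subsequent requests:

# Keep session cookies across requests: the cookie jar stores the
# session identifier issued by the start page and sends it back
# automatically on every later request to the same site.
from http.cookiejar import CookieJar
from urllib.request import build_opener, HTTPCookieProcessor

opener = build_opener(HTTPCookieProcessor(CookieJar()))
start_page = opener.open("http://acmegizmo/index.html").read()   # session established
catalog = opener.open("http://acmegizmo/catalog.html").read()    # same session reused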
an "invalid session" error message and be directed back to th e start page. This me ans that a craw ler needs to be co nfigured in such a way that it starts from a start page and carefully retain s the session information across page accesses, w hether the information is stored in cookies or CGI variables. Information that refers to a single "obj ect" but is broken into mu ltiple Web pages present s another design challeng e for Web data extraction . Som etimes information is broken into mu ltiple pages bec ause it is too voluminous and simply wo uld not fit on a single page or it would make it incon venient for a user to browse it. A case in point are search eng ines that typic ally return the result set in chunks of 10 or 20 results per page. An oth er reason for broken-up Web pages is the desire to improve the visual design of a Web site. H T M L frame s provide a mechanism to design effective Web sites that are in many cases easier to navigate. H owever, the content of each frame is a separate Web page and involves a separate Web page access by a browser or crawler. Merging pages that each individually co ntain part of the information relating to a single obj ect suffers from the sam e semantic problem we discussed earlier. It is one of "obj ect identity." Suppose one frame of a page contains the features of a product, w hile another frame co ntains the pri ce of that product. Suppose further th at the co nte nt of th e two fram es is retrieved at different times during the crawl, so there is no temporal association between them. What is now required is a pro cess that attem pts to merge the pie ces of information co ntained in each frame in ord er to build object in w hole. In an idea l case, the frame embeds an obj ect identifier somew he re in its URL or HTM L co ntent and one simpl y ex tracts th e identifier and doe s the mergin g of the piece s at th e XML file level or perhaps in a database. C hange manayement. From a research point of view, perh aps the most challenging aspe ct of Web data extraction is ensuring th e rob ustne ss of data extrac tio n patterns. It is typically relatively easy to develop patterns that perform perfectly on a given set of input Web pages. It is mu ch harder to choose patterns that wo rk reliably with pages that have no t been seen before and co ntinue to work on pages in the future even if the struc ture of H T M L pages or templates changes. Em pirical evidence suggests that the frequency of struc tura l changes in Web sites is inversely pro po rtional to th e size of the organization operating it. The intuition behind this statem ent is that in large enterprises and other org anizations decisions are not made by individual people bu t by committees, task force s, advisory teams, and so on. C hanging the design of a Web site is a major deci sion, as it is affected by corporate design guidelines, consistency requirements with other mass media, adherence to prevailing design standards (e.g. Web content accessibility stand ards [34]), and national langu age support. A large number of people and organization al entities have to come to an agreement over major chan ges affecting a Web site. In cont rast, a Web site ow ned by a small co mpany or an indi vidual can be chan ged mu ch more frequently. In fact, we have observed changes to some Web sites almost wee kly, and typically these changes parallel the increasing sophistication level of the Web designer making those changes. 
We can almost imagin e a Web designer reading an HTML programming m anu al, discovering a "c ool new featu re" and implementing it in th e Web site th at very mom ent!
3.5. Techniques for effective data extraction
We now discuss various techniques for dealing with the challenges presented earlier. The techniques and associated tools involved suggest a blueprint for constructing and deploying an actual Web data extraction system. The prototype system architecture described in the next section builds on the blueprint. Much of the data processing described so far has focused on storing, retrieving and transforming XML documents. XML storage and querying are well-understood concepts, and current or future releases of all major commercial database systems include some support for them, either natively or through a database system extension. It is important to note that XML transformation is really part of the broader concept of XML query, for which a new XQuery standard is emerging [37]. While real XML databases will be targeted to very large XML document collections with the corresponding efforts made in XML query optimization, smaller data extraction systems consisting of perhaps a few thousand documents may be achievable using a file system based storage scheme instead. In the latter case, XML query and transformation really just means running XSL stylesheets or XPath expressions [39] over a collection of XML files. Before XSL stylesheets or XPath expressions can be invoked over an arbitrary HTML page, however, one needs to "normalize" the page to a well-formed XML format, namely XHTML [35]. As Web design tools become more XHTML-conformant, it is likely that a significant fraction of future Web content will already be XHTML. Today and perhaps a few years into the future, however, it is still the case that the vast majority of Web content is plain old HTML, and badly broken too. Normalization tools, such as HTML Tidy [33], are therefore still very much required. As noted before, extraction from XHTML boils down to executing XSL stylesheets or XPath expressions over it. An XSL processor such as Xalan is required to execute stylesheets and expressions, so the problem really becomes one of figuring out what those stylesheets and expressions should be. Many different approaches are possible: manual tools, automatic tools, user-assisted tools, machine learning tools, and others. We take a detailed look at this issue in Section 5. Bypassing the various obstacles of a Web site in order to get access to the "Deep Web" is a tough problem in general but tractable in practice. Our earlier discussion highlighted the fact that the Robot Exclusion Standard is a gentleman's agreement and basically says that anyone who wants to stay on good terms with others (and avoid lawsuits) had better adhere to the agreement. This "obstacle" can stop many data extraction tasks in their tracks and should be the very first thing one checks when contemplating crawling a prospective Web site. Although cookies and session IDs were designed to improve the usability and interactivity of a Web site, nothing prevents the crawler from mimicking a Web browser and responding to the Web site's cookie requests just like a normal browser would do. Careful configuration of the crawler ensures that cookies and session IDs are picked up by visiting the home page or other "session initiation page" of the Web site before proceeding to the actual content one wants to extract.
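To make the cookie and session-ID handling concrete, the following sketch shows a crawler fetch that mimics a browser by visiting a session initiation page before requesting content. This is an illustrative sketch only, not the ANDES implementation: the URLs are hypothetical, and for brevity it uses the java.net.http API of modern Java, which postdates the systems described in this chapter.

import java.net.CookieManager;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SessionAwareFetcher {
    public static void main(String[] args) throws Exception {
        // A cookie manager stores session cookies just like a browser would.
        CookieManager cookies = new CookieManager();
        HttpClient client = HttpClient.newBuilder()
                .cookieHandler(cookies)
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();

        // Visit the session initiation page first so the server assigns a session ID.
        HttpRequest start = HttpRequest.newBuilder(
                URI.create("http://www.example.com/")).GET().build();
        client.send(start, HttpResponse.BodyHandlers.discarding());

        // Subsequent requests automatically carry the session cookie.
        HttpRequest content = HttpRequest.newBuilder(
                URI.create("http://www.example.com/catalog?page=1")).GET().build();
        HttpResponse<String> page =
                client.send(content, HttpResponse.BodyHandlers.ofString());
        System.out.println(page.body());
    }
}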
The same principle applies to JavaScript. Ideally, the crawler should be able to execute scripts embedded in an HTML page much like a browser would do. Script engines are available in the market and the Open Source community, so plugging one into a crawler is certainly possible. Full-scale script execution may be overkill, however, as most scripts merely improve the interactivity of the Web site (e.g. checking input parameters before forms are submitted) and don't really contribute to the data content of the site. Exceptions to this rule are scripts that modify the HTML page at run time according to the browser used and those that compute URLs on the fly, telling the browser to go to an alternate location instead of the one indicated in a static hyperlink or form. The former exception occurs when a script outputs page content only when executed; if the script is not executed, part of the page is not accessible. The missing part may contain links and other content that are critical to the proper function of the crawler and/or data extractor. The second exception is more common than the first. It is not unusual to see HTML forms where the URL to which the form is submitted is computed on the fly by a script. The script may choose the appropriate URL based on form input or it may add extra field values to the target URL. In either case, not knowing what link to follow is a major obstacle to the crawler. In the toughest cases, if a script engine is not available, one may need to resort to manual "reverse engineering" of the script code and recoding it in some other language that the crawler does understand, for instance XSL. When the crawler sees a page which is known to contain one of the "tough" scripts, the crawler loads the corresponding XSL sheet and transforms the page into a new page where the script has seemingly been executed. Simple script actions like adding or removing field values from URLs on the page are easily done using XSL. Some Web sites cannot be crawled without filling out HTML forms. Forms may be used to submit query terms to the backend database of the Web site or submit user information (e.g. login ID) when entering the Web site. Even if the data extractor or crawler can deduce from the form itself what data needs to be entered (e.g. that a field is a city name), the problem remains that the values entered are domain-specific and the extractor or crawler cannot possibly know what to enter on the form. It may be possible to build up domain knowledge by analyzing the content of the Web site [27]. In very targeted crawls, it may be permissible to manually control what is entered. For example, if the task is to crawl products made by a certain manufacturer, this is a clear hint that a manufacturer name must be entered on corresponding forms on the target Web site. One way to embed hints into the crawling process is to code them as XSL transformations (as was done with scripts) which take an HTML form as an input and produce one or more filled-out forms or simple hyperlinks as the output. Filling out forms and translating them into simple hyperlinks is known as "hyperlink synthesis" [24], illustrated in the sketch below. Yet another source of domain knowledge are Web proxy logs common in most organizations. A proxy log contains actual forms and data values entered by users browsing Web sites, and the data can be directly applied for automated crawling of those same sites.
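As a rough illustration of hyperlink synthesis, the sketch below reads a GET form from a tidied XHTML page, carries over its hidden fields, and appends a value supplied by domain knowledge. The file name, field names, and manufacturer value are hypothetical, and real forms may be POST-based or script-driven, which this sketch does not handle.

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class HyperlinkSynthesizer {
    // Turns a search form into a plain hyperlink by pre-filling a known value.
    public static void main(String[] args) throws Exception {
        // Namespace-unaware parsing (the default) keeps the XPath expressions simple.
        Document page = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse("catalog-page.xhtml");   // hypothetical tidied XHTML input
        XPath xpath = XPathFactory.newInstance().newXPath();

        Element form = (Element) xpath.evaluate(
                "//form[@method='get']", page, XPathConstants.NODE);
        String action = form.getAttribute("action");

        // Carry over hidden fields so the synthesized URL keeps required state.
        NodeList hidden = (NodeList) xpath.evaluate(
                ".//input[@type='hidden']", form, XPathConstants.NODESET);
        StringBuilder url = new StringBuilder(action).append('?');
        for (int i = 0; i < hidden.getLength(); i++) {
            Element in = (Element) hidden.item(i);
            url.append(in.getAttribute("name")).append('=')
               .append(in.getAttribute("value")).append('&');
        }
        // Domain knowledge supplied by the crawl configuration: the value to query.
        url.append("manufacturer=AcmeCorp");
        System.out.println("Synthesized link: " + url);
    }
}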
Figure 7. Outline of Data Extraction System Architecture.
4. OUTLINE OF A DATA EXTRACTION SYSTEM ARCHITECTURE
We now turn our attention to architectural issues in Web data extraction systems. We describe a sample architecture that suggests what components are required for effective data extraction and how those components interoperate. While particular systems may differ in terms of details, we believe that the discussion in this section will be helpful and provide a blueprint common to many such systems. The architecture is based on ANDES, a research framework for reusable Web data extraction systems [24]. ANDES was inspired by previous work on Web query systems, e.g. TSIMMIS [9], STALKER [17], and Junglee [8], and has seen continuous use within IBM for a wide variety of data extraction tasks and application domains. Among the domains where ANDES has been applied are news articles, consumer product reviews and prices, real estate listings, computer products, and construction materials. The architecture is composed of a set of Java and XML-based components that implement key features of data extraction systems. The components of the architecture are illustrated in Figure 7 and their tasks and relationships are summarized below. Data Retriever gathers HTML pages from the Web using a crawler mechanism or some other method. Gathered pages are normalized into XHTML and forwarded to the Data Extractor component. Data Extractor applies data extraction patterns encoded as XSL stylesheets to a set of XHTML documents. The output of the Extractor is a new set of XML documents which contain the extracted data. The XHTML documents are discarded and the XML documents are forwarded to the Data Checker component.
Data Checker inspects XML documents produced by the Extractor and ensures that the data contained in the XML documents is semantically and syntactically valid. Invalid documents could signify errors in the data extraction patterns and are marked for further inspection by the administrator. Valid documents are forwarded to the Data Exporter component. Data Exporter converts valid XML documents to some output format, for example SQL statements for database update, spreadsheets for data dissemination, or HTML output for Web publishing. The system is configured to either keep the XML documents or discard them. Administrative Interface is a Web-based management and monitoring tool for system administrators. Using the tool, the administrator can schedule new data extraction processes at specific times and days of the week and can inspect the progress of previously scheduled processes. Data that was extracted but marked as invalid can be browsed and acted upon. Pattern Designer is a graphical tool for the analysis of target Web sites and their HTML pages. The output of the tool is a set of data extraction patterns specific for a given Web site. We now describe each system component in more detail.

4.1. Data retriever
The Data Retriever is usually a crawler, such as the Grand Central Station (GCS) crawler developed at IBM Almaden Research Center [32]. An alternative Data Retriever is one that simply reads a list of URLs from a file and fetches each page on that list. This simple retriever works well for scenarios where the URLs are known in advance and they do not change over time. For instance, the home page of a target Web site does not move and could be extracted using the simple "URL Data Retriever." The Data Retriever fetches HTML pages from the target Web sites and turns them into well-formed XML documents using a tool like HTML Tidy. The normalized XHTML pages are stored in a staging area as XML files where the Data Extractor can pick them up. Separating the Data Retriever and Extractor into two distinct phases is important because in certain situations the Data Retriever may run on an entirely different machine or network than the Extractor. Similarly, the use of the Pattern Designer and testing of the resulting extraction patterns is most conveniently done using locally cached files, so there is no need to fetch target Web pages over and over again.
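A minimal sketch of the simple "URL Data Retriever" might look as follows, using JTidy (a Java port of HTML Tidy) to normalize each fetched page into XHTML before writing it to the staging area. The URL and directory names are hypothetical, and the staging directory is assumed to exist.

import java.io.FileOutputStream;
import java.io.InputStream;
import java.net.URL;
import org.w3c.tidy.Tidy;

public class UrlDataRetriever {
    // Fetches pages from a fixed URL list and normalizes them to XHTML.
    public static void main(String[] args) throws Exception {
        String[] urls = { "http://www.example.com/index.html" }; // known in advance
        Tidy tidy = new Tidy();
        tidy.setXHTML(true);          // emit well-formed XHTML
        tidy.setQuiet(true);
        tidy.setShowWarnings(false);

        int n = 0;
        for (String u : urls) {
            try (InputStream in = new URL(u).openStream();
                 FileOutputStream out =
                         new FileOutputStream("staging/page" + (n++) + ".xhtml")) {
                tidy.parse(in, out);  // normalized page lands in the staging area
            }
        }
    }
}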
4.2. Data extractor

The Data Extractor reads the set of XHTML files retrieved by the Data Retriever and applies one or more XSL files on those input files [14][24]. The ultimate output of the Data Extractor is a set of domain-specific XML files, one per input XML file. XSL files can be stacked or pipelined. Stacking means that one XSL file "calls" another XSL file, as if it were a subroutine in a conventional programming language. This is done by having one XSL file import another XSL file and then invoking one of its templates.
Figure 8. A Pipeline of XSL Stylesheets is Applied to an Input XHTML Page.
Pipelining XSL files is another powerful concept. Multiple XSL files are designed to operate as a sequence of transformations on the original input file (Figure 8). The output of the first XSL file is some intermediate XML format which is subsequently transformed by the second XSL file, and so on. The output of the last XSL file in the sequence is the final, domain-specific XML format. Alternatively, each XSL in the pipeline can improve the quality of the domain data, without touching the structure of the XML per se. The first XSL can transform the input XML into the final syntax but not necessarily the final data content. Subsequent XSL files work on pieces of the content, removing redundant data or noise, normalizing the usage of whitespace or capitalization in the file, or performing a mapping from one vocabulary to another. Consider a scenario where data is extracted from two E-commerce Web sites whose product catalogs contain identical products but use slightly different notations for product specifications. A first XSL file is designed for each Web site to perform basic extraction of relevant information from corresponding product pages and produce output whose structure conforms to some standard product markup language. The second XSL file for each Web site would work on the Manufacturer field of the markup and correct any misspellings or alternate spellings of manufacturer names. It could also normalize numerical values so that units of measure between the two sources match. A third XSL file is common to both sources. It removes redundant whitespace from every part of the XML document and inserts a standard header into the document. The output of this XSL file is the final output of the Data Extractor.
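The following sketch shows how such a pipeline can be driven from Java with the standard javax.xml.transform API: the first stylesheet performs structural extraction and the second cleanses the data, with the intermediate XML held in memory as a DOM rather than written to disk. The stylesheet and file names are hypothetical.

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ExtractionPipeline {
    // Applies a sequence of XSL stylesheets to one XHTML input file.
    public static void main(String[] args) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer extract =
                factory.newTransformer(new StreamSource("extract-product.xsl"));
        Transformer cleanse =
                factory.newTransformer(new StreamSource("normalize-vendor.xsl"));

        // Stage 1: structural extraction into an intermediate XML format.
        DOMResult intermediate = new DOMResult();
        extract.transform(new StreamSource("page.xhtml"), intermediate);

        // Stage 2: data cleansing (spelling, units, whitespace) on the intermediate XML.
        cleanse.transform(new DOMSource(intermediate.getNode()),
                          new StreamResult("product.xml"));
    }
}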
XPath expressions embedded in XSL files can be generated in one of several ways. The most straightforward way is to produce them manually using a Pattern Designer tool (discussed later). An alternative is to employ machine learning techniques and produce data extraction expressions automatically based on a sample set of pages. A hybrid, semi-automatic process is also possible. One could either use automatic tools to generate "rough" data extraction patterns and refine them using manual tools like the Pattern Designer. Or, one could use a manual tool to direct (or supervise) the learning process of the automated tool, ensuring that the right patterns are found and spurious extraction patterns are not generated.
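Whatever the generation method, the resulting pattern is ultimately just an XPath expression that can be tested against locally cached sample pages. A minimal sketch, assuming a hypothetical page structure and file name:

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

public class PatternTester {
    // Evaluates a hand-written extraction pattern against a cached sample page,
    // the kind of test loop a Pattern Designer tool automates.
    public static void main(String[] args) throws Exception {
        Document page = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse("sample-product.xhtml");
        XPath xpath = XPathFactory.newInstance().newXPath();

        // A relative pattern anchored on a label tends to be more robust
        // than an absolute path like /html/body/table[3]/tr[2]/td[2].
        String price = xpath.evaluate(
                "//td[normalize-space(.)='Price:']/following-sibling::td[1]", page);
        System.out.println("Extracted price: " + price);
    }
}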
4.3. Data checker

As Web sites change, the data extraction pattern encapsulated in an XSL stylesheet may fail to correctly extract data from the pages that changed. The Data Checker validates the quality of extracted data by inspecting the XML output of the Data Extractor, not the source XHTML files. This means that if a Web site changes but the XSL filter continues to correctly extract data from the changed pages, the new pages pass the data validation check and no alerts are generated. Data validation is performed at several levels of abstraction. Syntactic checks are performed first: they verify that each XML element is present in the output and that values match their expected types (numeric vs. string). In the early days of XML, the prevailing method for syntax checks was to rely on Document Type Definitions (DTDs). Those definitions did not provide sufficient means to check numeric values, for example. Today, XML Schemas [38] are used and provide the necessary power to enforce these syntactic requirements. If a schema is defined for the XML output of the Data Extractor, then it is a relatively simple matter to "validate" the output against the corresponding schema. If validation fails, the document was not extracted successfully. The syntactic check is followed by semantic checks which spot incorrect values. This is domain-specific but very powerful. For instance, if it is known that stock prices are usually less than $1000 (Berkshire Hathaway shares being the notable exception), this can be described to the Data Checker, which then separates the "bad" data from the "good." The bad data is moved to a staging area and the administrator is asked to decide what to do with it. The administrator can take one of four corrective actions using the Administrative Interface. The data can be accepted as-is or it can be treated as a one-time error (e.g. due to a network error) and discarded. The administrator can also manually correct the data if the data is mostly good but there are some invalid fields. Finally, if the data really is valid but did not pass the semantic checks, the administrator can modify the rules of the semantic check. For example, if the price of a hard disk drive was incorrectly flagged as invalid because it was below a previously set minimum (say, $30), the system can be told to adjust the boundary values to the new minimum.
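A sketch of the syntactic check using the standard javax.xml.validation API is shown below; the schema and document names are hypothetical. Semantic checks, such as the stock price bound above, would follow as domain-specific rules applied to documents that pass this stage.

import java.io.File;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

public class DataChecker {
    // Syntactic check: validates extracted XML against an XML Schema.
    public static void main(String[] args) {
        try {
            SchemaFactory sf =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = sf.newSchema(new File("product.xsd"));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new File("product.xml")));
            System.out.println("Document passed the syntactic check.");
        } catch (Exception e) {
            // Validation failure: move the document to the staging area
            // and flag it for inspection by the administrator.
            System.out.println("Extraction error suspected: " + e.getMessage());
        }
    }
}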
4.4. Data exporter

The Data Exporter component transforms the extracted data into some export format. In many cases, the final destination of extracted data is a relational database, so the export
format is a series of SQL statements which are then executed against the database. The SQL statements can either be INSERT statements (if the data is merely accumulated) or UPDATE statements (if new values replace old ones in the database). In some domains it may be preferable to convert the data into a spreadsheet format such as Comma Separated Values (CSV) files or a native spreadsheet format such as that used by Microsoft Excel. The spreadsheet files can then be disseminated via email or other mechanisms without much trouble. Note that newer versions of Microsoft Excel support XML directly, so one approach is to convert the extracted XML data into the "spreadsheet XML" format expected by Excel. XML is also used natively by the spreadsheet component of the open-source OpenOffice suite (formerly StarOffice by Sun Microsystems). An alternative method for inspection and dissemination of the data is to convert it back into HTML. However, this time the HTML format is quite different from the original HTML from which it was extracted. Whereas the original HTML format may have contained a large amount of extra "baggage" like advertising material, the new HTML format is lean and simple. The precise HTML format used is up to the system administrator. The common aspect of all these conversions, and further proof of the significance of using XML as the intermediate format in data extraction, is that all conversions can be accomplished using XSL stylesheets, the same technology used for data extraction itself. Conversion of XML to SQL or CSV is no more difficult than the conversion to spreadsheet XML or HTML. To support database updates natively, the Data Exporter attaches to a user-defined JDBC database using standard Java libraries. The access parameters of the database (its name and authentication parameters) are provided by the administrator in a configuration file. Once the Data Exporter has finished converting the extracted XML data into SQL, it executes the SQL statements without really needing to understand what those statements do. Again, the precise function of the statements is encoded in the XSL stylesheet and is up to the system administrator.
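The database leg of the Data Exporter can be sketched as follows: SQL statements produced by the XSL conversion are read and executed verbatim over JDBC. The connection URL, credentials, and file name are hypothetical placeholders for values the administrator would supply in the configuration file, and the sketch assumes one statement per line.

import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.List;

public class DataExporter {
    // Executes SQL statements produced by an XSL conversion of the extracted XML.
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                "jdbc:db2://localhost:50000/extract", "user", "password");
             Statement stmt = con.createStatement()) {
            List<String> sql = Files.readAllLines(Paths.get("export.sql"));
            for (String statement : sql) {
                if (!statement.trim().isEmpty()) {
                    // The exporter runs each statement without interpreting it;
                    // the semantics are encoded in the XSL stylesheet.
                    stmt.executeUpdate(statement);
                }
            }
        }
    }
}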
4.5. Scheduler

The Scheduler is responsible for activating the data extraction process at predetermined times and repeating the process periodically. The timing and frequency of activation is controlled by the system administrator using the Administrative Interface. The periodicity of data extraction depends largely on the frequency of data change, but also on domain-specific and corporate requirements. Data extraction is usually run during periods of low network activity, for instance at night. As discussed in Section 3, legal issues surrounding data extraction may have a strong influence on when and how data extraction is performed.
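A minimal scheduling sketch using the standard java.util.concurrent API, assuming a hypothetical nightly run at 2 a.m. as the low-activity window; a production scheduler would also honor per-configuration calendars set through the Administrative Interface.

import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ExtractionScheduler {
    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);

        // Delay until the next 2 a.m. (simplified: always tomorrow), then repeat daily.
        LocalDateTime now = LocalDateTime.now();
        LocalDateTime next = now.toLocalDate().plusDays(1).atTime(LocalTime.of(2, 0));
        long initialDelay = Duration.between(now, next).toMinutes();

        scheduler.scheduleAtFixedRate(
                () -> System.out.println("Starting scheduled extraction run..."),
                initialDelay, TimeUnit.DAYS.toMinutes(1), TimeUnit.MINUTES);
    }
}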
4.6. Administrative interface

The system needs a comprehensive, Web-based management interface that a system administrator can use for monitoring and controlling the system. Figure 9 shows an example of what the Administrative Interface might look like.
Figure 9. Screenshot of System Administrator's Console.
The administrator can start and stop crawler processes and see when the next crawl is scheduled to start. Clicking on the Details link displays a detailed view of the last crawl (Figure 10). The statistics shown in Figure 10 tell the system administrator that over 10,000 pages (in this case, real estate listings) were successfully extracted from the Web and the total processing time was 6 hours. Dividing 10,000 pages by the total crawling time (4 hours) yields an average crawling rate of about one page every 1.5 seconds. The crawler was configured to crawl the Web site very gently so as not to place too much load on the Web server. Once the pages have been retrieved from the Web site, they are processed locally. Total extraction and database processing time was 2 hours, or about one page per second. These numbers were gathered on a low-cost, single-processor server.
Figure 10. Detailed view of the last crawl: ANDES server status showing data retrieval statistics (stage completed; 11,151 pages crawled and 11,150 pages analyzed between 01:00 and 04:57 on 03/21/2003) and data extraction statistics.
Dramatically higher throughput is achieved by parallelizing the network accesses, data extraction, and database processing. It is also essential that the system be capable of monitoring itself and alerting the administrator when problems are encountered. Table 1 shows an email message generated by the ANDES system for a deployed news crawler. The system tracks the number of Web pages crawled and extracted in each run. Significant deviations in these numbers trigger the system to send an email message warning the system administrator that attention may be required.
Table 1. Email message notifying the administrator that a deviation was detected in the number of pages crawled and extracted
Warning: Last execution of configuration "CNet News" extracted data from 3013 pages while the previous crawl extracted data from 3350 pages.

Statistics Report
=================
Given that Web sites are autonomous and can change at any time, data extraction errors are likely to occur at some point during the lifetime of the system. Significant deviations in the number of pages crawled and extracted compared to previous crawls cause the system to issue an alert to the system administrator via email. For instance, if the navigation rules of a Web site change, the Data Retriever may get only half the number of target HTML pages compared to a previous crawl. When this happens, the administrator is notified and will then take a closer look to determine what the appropriate action is. The corrective action may be to adjust the crawling configuration or data extraction patterns. On the other hand, the change may just be a result of less data being available on the Web site. In that case, no change is required in the data extraction system.
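The deviation check itself can be sketched in a few lines; the 10% threshold is a hypothetical, tunable choice (with it, the 3013-versus-3350 example from Table 1 would just trigger an alert).

public class DeviationMonitor {
    // Flags a crawl whose page count deviates significantly from the previous run.
    static final double THRESHOLD = 0.10; // alert on a >10% change (tunable)

    static String check(String config, int currentPages, int previousPages) {
        double change =
                Math.abs(currentPages - previousPages) / (double) previousPages;
        if (change > THRESHOLD) {
            return String.format(
                "Warning: Last execution of configuration \"%s\" extracted data "
                + "from %d pages while the previous crawl extracted data from %d pages.",
                config, currentPages, previousPages);
        }
        return null; // no alert needed
    }

    public static void main(String[] args) {
        String alert = check("CNet News", 3013, 3350);
        if (alert != null) {
            System.out.println(alert); // in ANDES this would be sent by email
        }
    }
}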
4.7. Pattern designer

The Pattern Designer component is a tool used by an extraction engineer to design extraction templates. The tool helps the user analyze one or more sample HTML pages to come up with a set of robust extraction patterns. Graphical design tools are most powerful. A sample screenshot of our InFact tool is shown in Figure 11. The user loads an HTML page into the tool and uses its search function to find occurrences of the data that needs to be extracted.
Figure 11. Screenshot of InFact Extraction Pattern Generator.
The occurrences are then labeled with a descriptive name. Data that is repeated on multiple table rows or columns can be marked as a repetitive field, in which case the tool generates an extraction expression that iterates over those rows or columns. More advanced extraction pattern types are discussed in the next section.

5. DATA EXTRACTION PRINCIPLES
5.1. Extraction templates
As mentioned earlier, the majority of interesting Web content that is delivered as HTML Web pages is generated through some sort of template-based technology. The good news for data extractors is that this means that most of the pages to extract will have the same general structure and nearly the same markup. This consistency can easily be used to our advantage in order to robustly extract the more interesting content that does vary across these pages. Given that most Web pages are constructed using a template technology, it seems almost intuitive that these pages are ready candidates for processing by further templates. In fact, the data extraction process can be largely viewed as an extension to the
Figure 12. Process for Transformation of Server Data to Extracted Data.
mechanism that delivered the content in the first place. Web data that was transformed into a Web page with a template is simply then transformed with another template back into a data-centric form, as illustrated in Figure 12. While this technique can properly be termed an exercise in reverse engineering, the fact that the extraction mechanism is so like the creation mechanism simplifies the process immensely. As it turns out, creating the extraction templates is a far simpler task than creating the templates that produced the original Web pages. Of course, we are still left with the task of determining what technology to use as the backbone for extraction templates. The best choice for this technology would be one that can easily represent and manipulate Web pages, that provides navigation through the hierarchical structure of HTML, that allows for robust pattern matching in the document, and that produces results that integrate easily with the back-end system into which we are porting the extracted data. XSL meets all of these criteria and is preferred over other alternatives such as W3QL [18] and WebSQL [23] because it offers one large bonus: it has become a de facto standard for working with Web-based technologies, and most developers working in this sector are already familiar with its workings. To illustrate that XSL is in fact a solid general choice for representing extraction templates, consider its advantages. First, XSL was designed to work with other XML-like languages. This includes HTML once it has been tidied up. Second, XSL relies on a technology called XPath to work with its input. The sole purpose of XPath is to provide a robust way of navigating XML documents using a compact notation. This aspect of XSL is ideal for creating templates designed for reverse engineering. An XPath expression can traverse an HTML document recursively (as in XWRAP [22], WebL [5], and Informia [3]) and express predicates (WebLog [21]), context and delimiter patterns (WHISK [30]), and token features (SRV [7]). XSL is not limited to absolute path names like the HTML Extraction Language in W4F [29] and WIDL [2]. XSL stylesheets can also perform complex computations that require recursive function calls [15]. Perhaps the only drawback to the use of XSL is its lack of support for regular expressions, a way of extracting data from portions of text that have no markup using a compact notation that matches certain expressions. However, before becoming discouraged by this news, consider that the coming version update to XSL will add regular expression matching to the technology, and even now that functionality can be added to XSL using a mechanism called XSL extensions.
A final advantage to using XSL for our templates is its flexible output mechanism. Once the extraction patterns have been written, it is easy to adjust the output to reflect the requirements of the back-end system. As discussed previously, the required output format may be XML, Comma Separated Values, SQL commands, or something else.

5.2. Extracting XML data from HTML

From its infancy in the early 1990s until just a few years ago, the HTML language evolved continuously, introducing increasingly complex design elements. Design elements such as tables within tables, frames, and image maps were among the early additions. Later came client-side scripting, including its handling of mouse events (e.g. mousing over an image), which improved the interactivity of Web sites and allowed them to function more like real applications. In recent years, however, the makeup of HTML has stabilized and Web developers have shifted their focus to more programmatic Web standards such as XML and Web Services. Today, HTML no longer evolves as a language and, apart from incompatibilities that exist between the HTML features supported by different browsers, it is fair to say that HTML itself and related development and design tools are mature and produce consistent output. As a result of these developments, it is reasonable to assume that certain design paradigms in Web sites are programmed using a consistent and predictable set of HTML constructs. For instance, excepting complex graphics and client-side scripting, there is only one way to create a pull-down menu on a Web page: using the <select> and